Preface


During the latter part of the seventeenth century the new mathe- 
matical analysis emerged as the dominating force in mathematics. 
It is characterized by the amazingly successful operation with infinite 
processes or limits. Two of these processes, differentiation and inte- 
gration, became the core of the systematic Differential and Integral 
Calculus, often simply called “Calculus,” basic for all of analysis.

The importance of the new discoveries and methods was immediately 
felt and caused profound intellectual excitement. Yet, to gain mastery 
of the powerful art appeared at first a formidable task, for the avail- 
able publications were scanty, unsystematic, and often lacking in 
clarity. Thus, it was fortunate indeed for mathematics and science 
in general that leaders in the new movement soon recognized the 
vital need for writing textbooks aimed at making the subject ac- 
cessible to a public much larger than the very small intellectual elite of 
the early days. One of the greatest mathematicians of modern times, 
Leonhard Euler, established in introductory books a firm tradition, and
these books of the eighteenth century have remained sources of inspira- 
tion until today, even though much progress has been made in the 
clarification and simplification of the material. 

After Euler, one author after the other adhered to the separation of 
differential calculus from integral calculus, thereby obscuring a key 
point, the reciprocity between differentiation and integration. Only in
1927, when the first edition of R. Courant’s German Vorlesungen über
Differential- und Integralrechnung appeared in the Springer-Verlag,
was this separation eliminated and the calculus presented as a unified
subject.

From that German book and its subsequent editions the present 
work originated. With the cooperation of James and Virginia McShane
a greatly expanded and modified English edition of the “Calculus” was
prepared and published by Blackie and Sons in Glasgow since 1934, and
distributed in the United States in numerous reprintings by
Interscience-Wiley.

During the years it became apparent that the need of college and uni- 
versity instruction in the United States made a rewriting of this work 
desirable. Yet, it seemed unwise to tamper with the original versions 
which have remained and still are viable. 

Instead of trying to remodel the existing work it seemed preferable to 
supplement it by an essentially new book in many ways related to the 
European originals but more specifically directed at the needs of the 
present and future students in the United States. Such a plan became 
feasible when Fritz John, who had already greatly helped in the prepara- 
tion of the first English edition, agreed to write the new book together 
with R. Courant. 

While it differs markedly in form and content from the original, it is 
animated by the same intention: To lead the student directly to the 
heart of the subject and to prepare him for active application of his 
knowledge. It avoids the dogmatic style which conceals the motivation 
and the roots of the calculus in intuitive reality. To exhibit the interac- 
tion between mathematical analysis and its various applications and to 
emphasize the role of intuition remains an important aim of this new 
book. Somewhat strengthened precision does not, as we hope, inter- 
fere with this aim. 

Mathematics presented as a closed, linearly ordered, system of truths 
without reference to origin and purpose has its charm and satisfies a 
philosophical need. But the attitude of introverted science is unsuitable 
for students who seek intellectual independence rather than indoctrina- 
tion; disregard for applications and intuition leads to isolation and 
atrophy of mathematics. It seems extremely important that students 
and instructors should be protected from smug purism. 

The book is addressed to students on various levels, to mathema- 
ticians, scientists, engineers. It does not pretend to make the subject 
easy by glossing over difficulties, but rather tries to help the genuinely 
interested reader by throwing light on the interconnections and purposes 
of the whole. 

Instead of obstructing the access to the wealth of facts by lengthy 
discussions of a fundamental nature we have sometimes postponed such 
discussions to appendices in the various chapters. 

Numerous examples and problems are given at the end of various 
chapters. Some are challenging, some are even difficult; most of them 
supplement the material in the text. In an additional pamphlet more
problems and exercises of a routine character will be collected, and
moreover, answers or hints for the solutions will be given.

Many colleagues and friends have been helpful. Albert A. Blank 
not only greatly contributed incisive and constructive criticism, but he 
also played a major role in ordering, augmenting, and sifting of the 
problems and exercises, and moreover he assumed the main responsi- 
bility for the pamphlet. Alan Solomon helped most unselfishly and 
effectively in all phases of the preparation of the book. Thanks are also
due to Charlotte John, Anneli Lax, R. Richtmyer, and other friends,
including James and Virginia McShane.

The first volume is concerned primarily with functions of a single 
variable, whereas the second volume will discuss the more ramified 
theories of calculus for functions of several variables. 

A final remark should be addressed to the student reader. It might 
prove frustrating to attempt mastery of the subject by studying such a 
book page by page following an even path. Only by selecting shortcuts 
first and returning time and again to the same questions and difficulties 
can one gradually attain a better understanding from a more elevated 
point. 

An attempt was made to assist users of the book by marking with an 
asterisk some passages which might impede the reader at his first at- 
tempt. Also some of the more difficult problems are marked by an 
asterisk. 

We hope that the work in the present new form will be useful to the 
young generation of scientists. We are aware of many imperfections 
and we sincerely invite critical comment which might be helpful for later 
improvements. 


Richard Courant 
Fritz John 
June 1965


Introduction


Since antiquity the intuitive notions of continuous change, growth, 
and motion, have challenged scientific minds. Yet, the way to the 
understanding of continuous variation was opened only in the seven- 
teenth century when modern science emerged and rapidly developed in 
close conjunction with integral and differential calculus, briefly called 
calculus, and mathematical analysis. 

The basic notions of Calculus are derivative and integral: the 
derivative is a measure for the rate of change, the integral a measure 
for the total effect of a process of continuous change. A precise under- 
standing of these concepts and their overwhelming fruitfulness rests 
upon the concepts of limit and of function which in turn depend upon 
an understanding of the continuum of numbers. Only gradually, by 
penetrating more and more into the substance of Calculus, can one 
appreciate its power and beauty. In this introductory chapter we shall 
explain the basic concepts of number, function, and limit, at first 
simply and intuitively, and then with careful argument. 


1.1 The Continuum of Numbers 


The positive integers or natural numbers 1, 2, 3, ... are abstract
symbols for indicating “how many” objects there are in a collection or 
set of discrete elements. 

These symbols are stripped of all reference to the concrete qualities 
of the objects counted, whether they are persons, atoms, houses, or 
any objects whatever. 

The natural numbers are the adequate instrument for counting 
elements of a collection or “set.” However, they do not suffice for
another equally important objective: to measure quantities such as the
length of a curve and the volume or weight of a body. The question,
“how much?”, cannot be answered immediately in terms of the natural
numbers. The profound need for expressing measures of quantities 
in terms of what we would like to call numbers forces us to extend the 
number concept so that we may describe a continuous gradation of 
measures. This extension is called the number continuum or the system 
of “real numbers” (a nondescriptive but generally accepted name). 
The extension of the number concept to that of the continuum is so 
convincingly natural that it was used by all the great mathematicians 
and scientists of earlier times without probing questions. Not until the 
nineteenth century did mathematicians feel compelled to seek a firmer 
logical foundation for the real number system. The ensuing precise 
formulation of the concepts, in turn, led to further progress in mathe- 
matics. We shall begin with an unencumbered intuitive approach, and 
later on we shall give a deeper analysis of the system of real numbers.¹


a. The System of Natural Numbers and Its 
Extension. Counting and Measuring 


The Natural and the Rational Numbers. The sequence of “natural” 
numbers 1, 2, 3,... is considered as given to us. We need not discuss 
how these abstract entities, the numbers, may be categorized from a 
philosophical point of view. For the mathematician, and for anybody 
working with numbers, it is important merely to know the rules or laws 
by which they may be combined to yield other natural numbers. These 
laws form the basis of the familiar rules for adding and multiplying 
numbers in the decimal system; they include the commutative laws 
a + b = b + a and ab = ba, the associative laws a + (b + c) =
(a + b) + c and a(bc) = (ab)c, the distributive law a(b + c) = ab + ac,
the cancellation law that a + c = b + c implies a = b, etc. 

The inverse operations, subtraction and division, are not always 
possible within the set of natural numbers; we cannot subtract 2 
from 1 or divide 1 by 2 and stay within that set. To make these 
operations possible without restriction we are forced to extend the
concept of number by inventing the number 0, the “negative” integers, 
and the fractions. The totality of all these numbers is called the class or 
set of rational numbers; they are all obtained from unity by using the 
“rational operations” of calculation, namely, addition, subtraction, 
multiplication, and division.²

¹ A more complete exposition is given in What Is Mathematics? by Courant and
Robbins, Oxford University Press, 1962.

² The word “rational” here does not mean reasonable or logical but is derived from
the word “ratio” meaning the relative proportion of two magnitudes.

A rational number can always be written in the form p/q, where p
and q are integers and q ≠ 0. We can make this representation unique
by requiring that q is positive and that p and q have no common factor
larger than 1.

Within the domain of rational numbers all the rational operations, 
addition, multiplication, subtraction, and division (except division by 
zero), can be performed and produce again rational numbers. As we 
know from elementary arithmetic, operations with rational numbers 
obey the same laws as operations with natural numbers: thus the 
rational numbers extend the system of positive integers in a com- 
pletely straightforward way. 
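
As a small illustration of the two preceding paragraphs, the following sketch uses Python's standard `fractions.Fraction` type, which stores a rational number as p/q in lowest terms with positive q. The particular numbers are arbitrary choices made for this example, not values taken from the text.

```python
from fractions import Fraction

# Fraction normalizes p/q so that q > 0 and p, q have no common factor > 1,
# which is the unique representation described above.
x = Fraction(6, -4)
print(x.numerator, x.denominator)

# The four rational operations applied to rational numbers
# (division by zero excluded) again give rational numbers.
a, b = Fraction(1, 2), Fraction(-2, 3)
results = [a + b, a - b, a * b, a / b]
print(results)
```

Every entry of `results` is again a `Fraction`, in agreement with the statement that the rational operations do not lead outside the set of rational numbers.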


Graphical Representation of Rational Numbers. Rational numbers 
are usually represented graphically by points on a straight line L, 
the number axis. Taking an arbitrary point of L as the origin or point 0
and another arbitrary point as the point 1, we use the distance between
these two points to serve as a scale or unit of measurement and define the
direction from 0 to 1 as “positive.” The line with a direction thus
imposed is called a directed line. It is customary to depict L so that
the point 1 is to the right of the point 0 (Fig. 1.1). The location of any
point P on L is completely determined by two pieces of information:
the distance of P from the origin 0 and the direction from 0 to P (to the
right or left of 0). The point P on L representing a positive rational
number x lies at distance x units to the right of 0. A negative rational
number x is represented by the point −x units to the left of 0. In either
case the distance from 0 to the point which represents x is called the
absolute value of x, written |x|, and we have

$$|x| = \begin{cases} x, & \text{if } x \text{ is positive or zero}, \\ -x, & \text{if } x \text{ is negative}. \end{cases}$$

We note that |x| is never negative and equals zero only when x = 0.

Figure 1.1 The number axis.

From elementary geometry we recall that with ruler and compass it
is possible to construct a subdivision of the unit length into any number 
of equal parts. It follows that any rational length can be constructed 
and hence that the point representing a rational number x can be 
found by purely geometrical methods. 

In this way we obtain a geometrical representation of rational 
numbers by points on L, the rational points. Consistent with our 
notation for the points 0 and 1, we take the liberty of denoting both the 
rational number and the corresponding point on L by the same symbol x.

The relation x < y for two rational numbers means geometrically
that the point x lies to the left of the point y. In that case the distance
between the points is y − x units. If x > y, the distance is x − y units.
In either case the distance between two rational points x, y of L is
|y − x| units and is again a rational number.


Figure 1.2


A segment on L with end points a, b where a < b will be called an 
interval. The particular segment with end points 0, 1 is called the unit
interval. If the end points are included in the interval, we say the interval
is closed; if the end points are excluded, the interval is called open.
The open interval, denoted by (a, b), consists of those points x for which
a < x < b, that is, of those points that lie “between” a and b. The
closed interval, denoted by [a, b], consists of the points x for which
a ≤ x ≤ b.¹ In either case the length of the interval is b − a.

¹ The relation a ≤ x (read “a less than or equal to x”) is interpreted as “either
a < x, or a = x.” We interpret the double signs ≥ and ± in similar fashion.

The points corresponding to the integers 0, ±1, ±2, ... subdivide the
number axis into intervals of unit length. Every point on L is either
an end point or interior point of one of the intervals of the subdivision.
If we further subdivide every interval into q equal parts, we obtain a
subdivision of L into intervals of length 1/q by rational points of the
form p/q. Every point P of L is then either a rational point of the form
p/q or lies between two successive rational points p/q and (p + 1)/q
(see Fig. 1.2). Since successive points of subdivision are 1/q units
apart, it follows that we can find a rational point p/q whose distance
from P does not exceed 1/q units. The number 1/q can be made as small
as we please by choosing q as a sufficiently large positive integer. For
example, choosing $q = 10^n$ (where n is any natural number) we can
find a “decimal fraction” $x = p/10^n$ whose distance from P is less than
$1/10^n$. Although we do not assert that every point of L is a rational
point we see at least that rational points can be found arbitrarily close
to any point P of L.
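
The approximation by decimal fractions can be tried out numerically. The sketch below is only an illustration under a simplifying assumption: the point P is itself taken to be rational (22/7, an arbitrary choice), so that all arithmetic stays exact. The helper name `decimal_approximation` is introduced here for the example and does not occur in the text.

```python
from fractions import Fraction
import math

def decimal_approximation(x: Fraction, n: int) -> Fraction:
    """Return p/10**n with p = floor(x * 10**n).

    By construction 0 <= x - p/10**n < 1/10**n.
    """
    scale = 10 ** n
    p = math.floor(x * scale)
    return Fraction(p, scale)

P = Fraction(22, 7)  # hypothetical point on the number axis
for n in range(1, 5):
    approx = decimal_approximation(P, n)
    # The error shrinks below 1/10**n as n grows.
    print(n, approx, float(P - approx))
```

Here p is chosen as the largest integer with p/10^n not exceeding P, which is one of the two neighboring points of subdivision mentioned above.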


Density 


The arbitrary closeness of rational points to a given point P of L is 
expressed by saying: The rational points are dense on the number axis. 
It is clear that even smaller sets of rational numbers are dense, for 
example, the points $x = p/10^n$, for all natural numbers n and integers p.

Density implies that between any two distinct rational points a and
b there are infinitely many other rational points. In particular, the
point halfway between a and b, c = (a + b)/2, corresponding to the
arithmetic mean of the numbers a and b, is again rational. Taking the 
midpoints of a and c, of b and c, and continuing in this manner, we can 
obtain any number of rational points between a and b. 
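
The midpoint construction is easy to carry out mechanically. The following sketch (with a hypothetical helper `midpoints_between` and arbitrarily chosen end points) repeatedly takes the midpoint of a and the most recently found point, which is one of several possible ways of continuing "in this manner."

```python
from fractions import Fraction

def midpoints_between(a: Fraction, b: Fraction, count: int) -> list:
    """Return `count` distinct rational points strictly between a and b.

    Each new point is the arithmetic mean of a and the previous point,
    so the points move toward a without ever reaching it.
    """
    points = []
    right = b
    for _ in range(count):
        c = (a + right) / 2   # arithmetic mean, again rational
        points.append(c)
        right = c
    return points

print(midpoints_between(Fraction(1, 3), Fraction(1, 2), 5))
```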

An arbitrary point P on L can be located to any degree of precision 
by using rational points. At first glance it might then seem that the 
task of locating P by a number has been achieved by introducing the 
rational numbers. After all, in physical reality quantities are never 
given or known with absolute precision but always only with a degree 
of uncertainty and therefore might just as well be considered as measured 
by rational numbers. 


Incommensurable Quantities. Dense as the rational numbers are, 
they do not suffice as a theoretical basis of measurement by numbers. 
Two quantities whose ratio is a rational number are called commen- 
surable because they can be expressed as integral multiples of a common 
unit. As early as in the fifth or sixth century B.C. Greek mathematicians
and philosophers made the surprising and profoundly exciting dis- 
covery that there exist quantities which are not commensurable with 
a given unit. In particular, line segments exist which are not rational 
multiples of a given unit segment. 

It is easy to give an example of a length incommensurable with the
unit length: the diagonal l of a square with the sides of unit length. For,
by the theorem of Pythagoras, the square of this length l must be equal
to 2. Therefore, if l were a rational number and consequently equal to
p/q, where p and q are positive integers, we should have $p^2 = 2q^2$. We
can assume that p and q have no common factors, for such common
factors could be canceled out to begin with. According to the above
equation, $p^2$ is an even number; hence p itself must be even, say $p = 2p'$.
Substituting $2p'$ for p gives us $4p'^2 = 2q^2$, or $q^2 = 2p'^2$; consequently, $q^2$
is even and so q is also even. This proves that p and q both have the
factor 2. However, this contradicts our hypothesis that p and q have
no common factor. Since the assumption that the diagonal can be
represented by a fraction p/q leads to a contradiction, it is false.
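
A finite search cannot replace the proof, but it may serve as a sanity check of the conclusion. The sketch below (the function name and the search bound are assumptions made for this example) looks for positive integers p, q with $p^2 = 2q^2$ among all q up to a given bound and, in accordance with the argument above, finds none.

```python
import math

def has_rational_sqrt2(max_q: int) -> bool:
    """Search for positive integers p, q with q <= max_q and p*p == 2*q*q."""
    for q in range(1, max_q + 1):
        p = math.isqrt(2 * q * q)   # largest integer with p*p <= 2*q*q
        if p * p == 2 * q * q:
            return True
    return False

print(has_rational_sqrt2(10_000))
```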

This reasoning, a characteristic example of indirect proof, shows that
the symbol $\sqrt{2}$ cannot correspond to any rational number. Another
example is $\pi$, the ratio of the circumference of a circle to its diameter.
The proof that $\pi$ is not rational is much more complicated and was
obtained only in modern times (Lambert, 1761). It is easy to find many
incommensurable quantities (see Problem 1, p. 106); in fact, incom- 
mensurable quantities are in a sense far more common than the 
commensurable ones (see p. 99). 


Irrational Numbers 


Because the system of rational numbers is not sufficient for geom- 
etry, it is necessary to invent new numbers as measures of incommen- 
surable quantities: these new numbers are called “irrational.” The 
ancient Greeks did not emphasize the abstract number concept, but 
considered geometric entities, such as line segments, as the basic 
elements. In a purely geometrical way, they developed a logical 
system for dealing and operating with incommensurable quantities 
as well as commensurable (rational) ones. This important achieve- 
ment, initiated by the Pythagoreans, was greatly advanced by Eudoxus 
and is expressed at length in Euclid’s famous Elements. In modern
times mathematics was recreated and vastly expanded on a foundation 
of number concepts rather than geometrical ones. With the introduction 
of analytic geometry a reversal of emphasis developed in the ancient 
relationship between numbers and geometrical quantities and the 
classical theory of incommensurables was all but forgotten or disre- 
garded. It was assumed as a matter of course that to every point 
on the number axis there corresponds a rational or irrational number 
and that this totality of “real” numbers obeys the same arithmetical
laws as the rational numbers do. Only later, in the nineteenth century,
was the need for justifying such an assumption felt and was eventually
completely satisfied in a remarkable booklet by Dedekind which makes
fascinating reading even today.¹


¹ R. Dedekind, “Nature and Meaning of Number” in Essays on Number, London
and Chicago, 1901. (The first of these essays, “Continuity and Irrational Numbers,”
supplies a detailed account of the definition and laws of operation with real numbers.)
Reprinted under title Essays on the Theory of Numbers, Dover, New York,
1964. The original of these translations appeared in 1887 under the title “Was sind
und was sollen die Zahlen?”

In effect, Dedekind showed that the “naive” approach practiced
by all the great mathematicians from Fermat and Newton to Gauss 
and Riemann was on the right track: That the system of real numbers 
(as symbols for the lengths of segments, or otherwise defined) is a 
consistent and complete instrument for scientific measurement, and that 
in this system the rules of computation of the rational number system 
remain valid. 

Without harm, one could leave it at that and turn directly to 
the substance of calculus. However, for a deeper understanding of the 
concept of real number, which is necessary for our later work, the 
following account as well as the Supplement to this chapter should be 
studied.


b. Real Numbers and Nested Intervals 


For the moment let us think of the points on a line L as the basic 
elements of the continuum. We postulate that to each point on L 
there corresponds a “real number” x, its coordinate, and that for these 
numbers x, y the relationships just described for the rational numbers 
retain their meaning. In particular, the relationship x < y indicates 
order on L and the expression |y — x| means the distance between the 
point x and the point y. The basic problem is to relate these numbers 
(or measurements on the geometrically given continuum of points) to 
the rational numbers considered originally and hence ultimately to 
the integers. In addition, we have to explain how to operate with the 
elements of this “number-continuum” in the same way as with the
rational numbers. Eventually, we shall formulate the concept of the 
continuum of numbers independently of the intuitive geometric con- 
cepts, but for the present we postpone some of the more abstract 
discussion to the Supplement. 

How can we describe an irrational real number? For some numbers
such as $\sqrt{2}$ or $\pi$, we can give a simple geometric characterization, but
that is not always feasible. A method flexible enough to yield every real 
point consists in describing the value x by a sequence of rational 
approximations of greater and greater precision. Specifically, we shall 
approximate x simultaneously from the right and from the left with 
successively increasing accuracy and in such a way that the margin 
of error approaches zero. In other words, we use a “sequence” of 
rational intervals containing x, with each interval of the sequence 
containing the next one, such that the length of the interval, and with 
it the error of the approximation, can be made smaller than any specified 
positive number by taking intervals sufficiently far along in the sequence.

To begin, let x be confined to a closed interval $I_1 = [a_1, b_1]$, that is,

$$a_1 \le x \le b_1,$$

where $a_1$ and $b_1$ are rational (see Fig. 1.3). Within $I_1$ we consider a
“subinterval” $I_2 = [a_2, b_2]$ containing x, that is,

$$a_1 \le a_2 \le x \le b_2 \le b_1,$$

where $a_2$ and $b_2$ are rational. For example, we may choose for $I_2$ one
of the halves of $I_1$, for x must lie in one or both of the half-intervals.
Within $I_2$ we consider a subinterval $I_3 = [a_3, b_3]$ which also contains x:

$$a_1 \le a_2 \le a_3 \le x \le b_3 \le b_2 \le b_1,$$

where $a_3$ and $b_3$ are rational, etc. We require that the length of the
interval $I_n$ tends to zero with increasing n; that is, that the length of
$I_n$ is less than any preassigned positive number for all sufficiently
large n. A set of closed intervals $I_1, I_2, I_3, \ldots$ each containing the
next one and such that the lengths tend to zero will be called a “nested
sequence of intervals.” The point x is uniquely determined by the
nested sequence; that is, no other point y can lie in all $I_n$, since the
distance between x and y would exceed the length of $I_n$ once n is
sufficiently large. Since here we always choose rational points for the end
points of the $I_n$ and since every interval with rational end points is
described by two rational numbers, we see that every point x of L,
that is, every real number, can be precisely described with the help of
infinitely many rational numbers. The converse statement is not so
obvious; we shall accept it as a basic axiom.

Figure 1.3 A nested sequence of intervals.


POSTULATE OF NESTED INTERVALS. If $I_1, I_2, I_3, \ldots$ form a nested
sequence of intervals with rational end points, there is a point x contained
in all $I_n$.¹

¹ It is important to emphasize for a nested sequence that the intervals $I_n$ are closed.
If, for example, $I_n$ denotes the open interval 0 < x < 1/n, then each $I_n$ contains the
following one and the lengths of the intervals tend to zero; but there is no x
contained in all $I_n$.

As we shall see, this is an axiom of continuity: it guarantees that no
gaps exist on the real axis. We shall use the axiom to characterize
the real continuum and to justify all operations with limits which are
basic for calculus and analysis. (There also are many other ways of
formulating this axiom as we shall see later.)
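
To make the notion of a nested sequence concrete, here is a minimal sketch that builds such a sequence for $\sqrt{2}$ by repeated halving, starting from the interval [1, 2]. The choice of $\sqrt{2}$, the starting interval, and the function name `nested_intervals_sqrt2` are assumptions made for this illustration; the text itself only requires rational end points, nesting, and lengths tending to zero.

```python
from fractions import Fraction

def nested_intervals_sqrt2(steps: int) -> list:
    """Return closed intervals [a, b] with rational end points, each one
    containing the next, all satisfying a*a <= 2 <= b*b."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        m = (a + b) / 2          # rational midpoint
        if m * m <= 2:
            a = m                # sqrt(2) lies in the right half
        else:
            b = m                # sqrt(2) lies in the left half
        intervals.append((a, b))
    return intervals

for a, b in nested_intervals_sqrt2(5):
    print(a, b, b - a)
```

After k halvings the interval has length $1/2^k$, so the lengths tend to zero, and the postulate then asserts that some point lies in all of the intervals.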


c. Decimal Fractions. Bases Other than Ten 


Infinite Decimal Fractions. One of the many ways of defining real 
numbers is the familiar description in terms of infinite decimals. It is 
entirely possible to take the infinite decimals as the basic objects rather 
than the points of the number axis, but we would rather proceed in a 
more suggestive geometrical way by defining the infinite decimal repre- 
sentation of real numbers in terms of nested sequences of intervals. 

Let the number axis be subdivided into unit intervals by the points 
corresponding to integers. A point x either lies between two successive 
points of subdivision or is itself one of the dividing points. In either 
case there is at least one integer $c_0$ such that

$$c_0 \le x \le c_0 + 1,$$

so that x belongs to the closed interval $I_0 = [c_0, c_0 + 1]$. We divide
$I_0$ into ten equal parts by points $c_0 + \frac{1}{10}, c_0 + \frac{2}{10}, \ldots, c_0 + \frac{9}{10}$.
The point x must then belong to at least one of the closed subintervals
of $I_0$ (possibly to two adjacent ones if x is one of the points of subdivision).
In other words, there is a digit $c_1$ (that is, one of the integers 0, 1,
2, ..., 9) such that x belongs to the closed interval $I_1$ given by

$$c_0 + \frac{1}{10} c_1 \le x \le c_0 + \frac{1}{10} c_1 + \frac{1}{10}.$$

Dividing $I_1$ in turn into ten equal parts, we find a digit $c_2$ such that x
lies in the interval $I_2$ given by

$$c_0 + \frac{1}{10} c_1 + \frac{1}{100} c_2 \le x \le c_0 + \frac{1}{10} c_1 + \frac{1}{100} c_2 + \frac{1}{100}.$$

We repeat this process. After n steps x is confined to an interval $I_n$
given by

$$c_0 + \frac{1}{10} c_1 + \cdots + \frac{1}{10^n} c_n \le x \le c_0 + \frac{1}{10} c_1 + \cdots + \frac{1}{10^n} c_n + \frac{1}{10^n},$$

where $c_1, c_2, \ldots$ are all digits. The interval $I_n$ has length $1/10^n$, which
tends to zero for increasing n. It is clear that the $I_n$ form a nested set of
intervals, and hence that x is determined uniquely by the $I_n$. Since the
$I_n$ are known, once the numbers $c_0, c_1, c_2, \ldots$ are given we find that an
arbitrary real number can be described completely by an infinite
sequence of integers $c_0, c_1, c_2, \ldots$, where all except the first are digits,
having values from zero to nine only. In ordinary decimal notation the
connection between x and $c_0, c_1, c_2, \ldots$ is indicated by writing

$$x = c_0 + 0.c_1 c_2 c_3 \cdots.$$

(Usually, the integer $c_0$ itself is also written in decimal notation if $c_0$
is positive.) Conversely, by the axiom of continuity, every such
expression denoting an infinite decimal fraction represents a real number.
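
The construction just described can be followed step by step in a short program. The sketch below assumes, for the sake of exact arithmetic, that x is rational, and whenever x is a point of subdivision it always selects the subinterval having x as its left-hand end point; the function name `decimal_digits` is hypothetical.

```python
from fractions import Fraction
import math

def decimal_digits(x: Fraction, n: int):
    """Return (c0, [c1, ..., cn]) such that
    c0 + c1/10 + ... + cn/10**n <= x <= c0 + c1/10 + ... + cn/10**n + 1/10**n.
    """
    c0 = math.floor(x)          # x lies in [c0, c0 + 1]
    remainder = x - c0          # position of x inside I_0, in [0, 1)
    digits = []
    for _ in range(n):
        remainder *= 10         # subdivide the current interval into ten parts
        c = math.floor(remainder)   # index of the part containing x: a digit 0..9
        digits.append(c)
        remainder -= c
    return c0, digits

print(decimal_digits(Fraction(1, 7), 6))
```

For x = 1/7 the digits reproduce the familiar expansion 0.142857..., and each additional step confines x to an interval one tenth as long as the preceding one.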

It is possible that there are two different decimal representations of 
the same number; for example, 


$$1 = 0.99999\cdots = 1.00000\cdots.$$

In our construction the integer $c_0$ is determined uniquely by x unless x
itself is an integer. In that case we could choose either $c_0 = x$ or
$c_0 = x - 1$. Once a choice has been made the digit $c_1$ is unique unless
x is one of the new points subdividing $I_0$ into ten equal parts. Continuing
we find that $c_0$ and all $c_k$ are determined uniquely by x unless x
occurs as a point of subdivision at some stage. If this should happen
for the first time at the nth stage, then

$$x = c_0 + \frac{c_1}{10} + \cdots + \frac{c_n}{10^n},$$

where $c_1, c_2, \ldots, c_n$ are digits and where $c_n > 0$, since otherwise x
would have been a point of subdivision at an earlier stage. It follows
that $I_{n+1}$ is either the interval $[x, x + 1/10^{n+1}]$ or the interval
$[x - 1/10^{n+1}, x]$. In the first case x will be the left-hand end point of
all later intervals $I_{n+2}, I_{n+3}, \ldots$, and in the second case, the right-hand
end point. We are then led either to the decimal representation

$$x = c_0 + 0.c_1 c_2 \cdots c_n 000\cdots$$

or the representation

$$x = c_0 + 0.c_1 c_2 \cdots (c_n - 1)99999\cdots.$$


Hence the only case in which an ambiguity can arise is for a rational 
number x which can be written as a fraction having a power of ten 
for its denominator. We can eliminate even this ambiguity by excluding 
decimal representations in which all digits from a certain point on are 
nines. 

In the decimal representation of real numbers the special role played 
by the number ten is purely incidental. The only evident reason for 
the widespread use of the decimal system is the ease of counting by 
tens on our fingers (digits). Any integer p greater than one can serve 
equally well. We could use p equal subdivisions at each stage. A real
number x would then be represented in the form

$$x = c_0 + 0.c_1 c_2 c_3 \cdots,$$

where $c_0$ is an integer, and now $c_1, c_2, \ldots$ have one of the values
0, 1, 2, ..., p − 1. This representation again characterizes x by a
nested set of intervals, namely

$$c_0 + \frac{c_1}{p} + \frac{c_2}{p^2} + \cdots + \frac{c_n}{p^n} \le x \le c_0 + \frac{c_1}{p} + \frac{c_2}{p^2} + \cdots + \frac{c_n}{p^n} + \frac{1}{p^n}.$$

If x is positive or zero, the integer $c_0$ is also positive or zero and $c_0$
itself has a finite expansion of the form

$$c_0 = d_0 + p d_1 + p^2 d_2 + \cdots + p^k d_k,$$

where $d_0, d_1, \ldots, d_k$ take one of the values 0, 1, ..., p − 1. The
complete representation of x “to the base p” takes the form

$$x = d_k d_{k-1} \cdots d_1 d_0 . c_1 c_2 c_3 \cdots.$$

If x is negative, we may use this kind of representation for −x.


Figure 1.4 The fraction 21/4 in the binary system.


Bases other than 10 have actually been used extensively. Following 
the lead of the ancient Babylonians, astronomers for many centuries 
consistently represented numbers as “sexagesimal” fractions with
p = 60 as the base. 


Binary Representation. The “binary” system with the base p = 2 
has special theoretical interest and is useful in the logical design of 
computing machines. In the binary system the digits have only two 
possible values, zero and one. The number 21/4, for example, would be
written 101.01 corresponding to the formula

$$\frac{21}{4} = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 + 0 \cdot 2^{-1} + 1 \cdot 2^{-2}$$

(see Fig. 1.4).
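
The representation to an arbitrary base p can be computed in the same way as the decimal one. The following sketch is an illustration only: it assumes a nonnegative rational x and a base between 2 and 36 (so that single characters suffice as digit symbols), and the names `DIGITS` and `to_base` are introduced here for the example.

```python
from fractions import Fraction
import math

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(x: Fraction, p: int, n: int) -> str:
    """Write a nonnegative rational x to the base p with n digits after the point."""
    if x < 0 or not 2 <= p <= len(DIGITS):
        raise ValueError("need x >= 0 and 2 <= p <= 36")
    c0 = math.floor(x)
    frac = x - c0
    # Integer part: c0 = d0 + p*d1 + p**2*d2 + ... (lowest digit first).
    int_digits = []
    while True:
        c0, d = divmod(c0, p)
        int_digits.append(DIGITS[d])
        if c0 == 0:
            break
    # Fractional part: one subdivision into p equal parts per digit.
    frac_digits = []
    for _ in range(n):
        frac *= p
        c = math.floor(frac)
        frac_digits.append(DIGITS[c])
        frac -= c
    return "".join(reversed(int_digits)) + "." + "".join(frac_digits)

print(to_base(Fraction(21, 4), 2, 2))
```

With x = 21/4 and p = 2 the function yields the string 101.01, the binary representation discussed above.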

Calculating with Real Numbers. Although the definition of real
numbers and their infinite decimal or binary representations, etc., are
straightforward, it may not seem obvious that one can operate with the
number continuum exactly as with rational numbers, performing
the rational operations and retaining the laws of arithmetic, such as the 
associative, the commutative, and the distributive laws. The proof is 
simple, although somewhat tedious. Instead of impeding the way to the 
live substance of analysis by taking up the question here, we shall 
accept temporarily the possibility of ordinary arithmetic calculation 
with the real numbers. A deeper understanding of the logical structure 
underlying the number concept will come when we discover the idea of 
limit and its implications. (See the Supplement to this chapter, p. 89.) 


d. Definition of Neighborhood 


Not only the rational operations but also order relations or in- 
equalities for real numbers obey the same rules as for the rational 
numbers. 

Pairs of real numbers a and b with a < b again give rise to closed
intervals [a, b] (given by a ≤ x ≤ b) and open intervals (a, b) (given by
a < x < b). Frequently we shall be led to associate with a point $x_0$ the
various open intervals that contain that point or specifically have it as
center, which we shall call neighborhoods of the point. More precisely,
for any positive ε the ε-neighborhood of the point $x_0$ consists of the
values x for which $x_0 - \varepsilon < x < x_0 + \varepsilon$, that is, it is the interval
$(x_0 - \varepsilon, x_0 + \varepsilon)$. Any open interval (a, b) containing a point $x_0$ always
also contains a whole neighborhood of $x_0$.

Having defined intervals with real end points we can now form nested 
sequences of intervals using the same definition as in the case of rational 
end points. It is most important for the logical consistency of calculus 
that for any nested sequence of intervals with real end points there is a 
real number contained in all of them. (See Supplement, p. 95.) 


e. Inequalities 
Basic Rules 


Inequalities play a far larger role in higher mathematics than in 
elementary mathematics. Often the precise value of a quantity x is 
difficult to determine, whereas it may be easy to make an estimate of x,
that is, to show that x is greater than some known quantity a and less 
than some other quantity b. For many purposes, only the information 
contained in such an estimate of x is significant. We shall therefore 
briefly recall some of the elementary rules about inequalities. 

The basic fact is that the sum and product of two positive real
numbers are again positive; that is, if a > 0 and b > 0, then a + b > 0
and ab > 0. Moreover, we rely on the fact that the inequality a > b is
equivalent to a − b > 0. Consequently, two inequalities a > b and
c > d can be added to yield the inequality a + c > b + d since


(a + c) — (b + d) = (a — b) + (c — d) 


is positive as the sum of two positive numbers. (Subtracting the 
inequalities to obtain a — c > b — d is not legitimate. Why?) An 
inequality can be multiplied by a positive number; that is, if a > b and
c > 0, then ac > bc. For the proof, we observe that


ac — bc = (a — b)c 


is positive since it is the product of positive numbers. If c is negative,
we can conclude from a > b that ac < bc. More generally, it follows
from a > b > 0 and c > d > 0 that ac > bd.
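
The rules just stated can be checked on concrete numbers. The short Python sketch below is an illustration only; the helper names `check_rules` and `compare_after_scaling` are assumptions of this example. It shows one instance in which adding two inequalities works while subtracting them fails, and the effect of multiplying by a positive, negative, or zero factor.

```python
def check_rules(a, b, c, d):
    """For a > b and c > d, report which derived inequalities hold."""
    if not (a > b and c > d):
        raise ValueError("requires a > b and c > d")
    return {
        "sum": a + c > b + d,          # guaranteed by the addition rule
        "difference": a - c > b - d,   # not guaranteed; may be False
    }

def compare_after_scaling(a, b, c):
    """For a > b, return the relation between a*c and b*c."""
    if not a > b:
        raise ValueError("requires a > b")
    left, right = a * c, b * c
    if left > right:
        return ">"
    if left < right:
        return "<"
    return "="

print(check_rules(2, 1, 5, 1))          # sum holds, difference fails
print(compare_after_scaling(3, 2, 4))   # positive factor keeps the sign
print(compare_after_scaling(3, 2, -4))  # negative factor reverses it
```

A single failing instance is enough to show that a rule is not generally valid; a passing instance, of course, proves nothing by itself.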

It is geometrically obvious that inequality is transitive; that is, if
a > b and b > c, then a > c. Transitivity¹ also follows immediately
from the positivity of the sum

(a − b) + (b − c) = a − c.

The preceding rules also hold if we replace the sign > by ≥ everywhere.

Let a and b be positive numbers and observe that

$$a^2 - b^2 = (a + b)(a - b).$$


Since a + b is positive, we conclude that $a^2 > b^2$ follows from a > b.
Thus an inequality between positive numbers can be “squared.”
Similarly, $a^2 \ge b^2$ whenever a ≥ b ≥ 0. From the equation

$$a - b = \frac{1}{a + b}\,(a^2 - b^2),$$

valid for all positive a and b, it follows that the converse is also true;
that is, for positive a and b, $a^2 > b^2$ implies a > b. Applying this
result to the numbers $a = \sqrt{x}$, $b = \sqrt{y}$, for arbitrary positive real
numbers x, y, we find² that $\sqrt{x} > \sqrt{y}$ when x > y. More generally,
$\sqrt{x} \ge \sqrt{y}$ whenever x ≥ y ≥ 0. Hence it is legitimate to take the


1 Transitivity justifies the use of the compound formula “a < b < c ...” to express
“a < b and b < c, etc.” Avoid nontransitive arrangements like x < y > z; these
are confusing and misleading.


2 Here and hereafter the symbol $\sqrt{z}$ for z ≥ 0 denotes that nonnegative number
whose square is z. With this convention $|c| = \sqrt{c^2}$ for any real c since |c| ≥ 0 and
$|c|^2 = c^2$. From this we obtain the important identity |xy| = |x| · |y| since

$$|xy|^2 = (xy)^2 = x^2 y^2 = (|x| \cdot |y|)^2.$$

square root of both sides of an inequality between nonnegative real
numbers.

Suppose that a and b are positive and n is a positive integer. In the
factorization

$$a^n - b^n = (a - b)(a^{n-1} + a^{n-2} b + \cdots + a b^{n-2} + b^{n-1})$$

the second factor is positive. Thus $a^n - b^n$ has the same sign as a − b;
if $a^n > b^n$, then a > b and if $a^n < b^n$, then a < b.

Most inequalities we shall encounter occur in the form of estimates
for the absolute value of a number. We recall that |x| is defined to be
x for x ≥ 0 and −x for x < 0. We may also say that |x| is the larger
of the two numbers x and −x when x is not zero and is equal to both
of them when x is zero. The inequality |x| ≤ a then states that neither
x nor −x exceeds a, that is, that x ≤ a and −x ≤ a. Since −x ≤ a is
equivalent to x ≥ −a, we see that the inequality |x| ≤ a means that x
lies in the closed interval −a ≤ x ≤ a with center 0 and length 2a.
The inequality $|x - x_0| \le a$ then states that $-a \le x - x_0 \le a$ or that
$x_0 - a \le x \le x_0 + a$, thus, that x lies in the closed interval with center $x_0$
and length 2a (see Fig. 1.5). Similarly, the ε-neighborhood
$(x_0 - \varepsilon, x_0 + \varepsilon)$ of a point $x_0$, that is, the open interval $x_0 - \varepsilon < x < x_0 + \varepsilon$,
can be described by the inequality $|x - x_0| < \varepsilon$.

Figure 1.5 The interval $|x - x_0| \le a$.
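
The description of intervals by absolute values translates directly into a membership test. The following Python sketch is illustrative; the function names are assumptions of this example. The last helper picks one possible radius showing that an open interval (a, b) containing a point contains a whole neighborhood of that point.

```python
from fractions import Fraction

def in_closed_interval(x, x0, a):
    """True if x lies in the closed interval of center x0 and length 2a."""
    return abs(x - x0) <= a

def in_neighborhood(x, x0, eps):
    """True if x lies in the eps-neighborhood (x0 - eps, x0 + eps) of x0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return abs(x - x0) < eps

def neighborhood_radius(a, b, x0):
    """A positive eps with (x0 - eps, x0 + eps) contained in (a, b)."""
    if not a < x0 < b:
        raise ValueError("x0 must lie in the open interval (a, b)")
    return min(x0 - a, b - x0)

print(in_closed_interval(3, 2, 1))               # True: the end point belongs
print(in_neighborhood(3, 2, 1))                  # False: the end point is excluded
print(neighborhood_radius(Fraction(1, 3), 1, Fraction(1, 2)))   # 1/6
```

The choice min(x0 − a, b − x0) is the largest admissible radius; any smaller positive number would serve equally well.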


Triangle Inequality 


One of the most important inequalities involving absolute values is 
the so-called triangle inequality 


$$|a + b| \le |a| + |b|$$

for any real a, b. The name “triangle inequality” is more appropriate
for the equivalent statement

$$|\alpha - \beta| \le |\alpha - \gamma| + |\gamma - \beta|$$

for which we have set a = α − γ, b = γ − β. The geometrical interpretation
of this statement is that the direct distance from α to β is
less than or equal to the sum of the distances via a third point γ (this
also corresponds to the fact that in any triangle the sum of two
sides exceeds the third side).

A formal proof of the triangle inequality is easily given. We distinguish
the cases a + b ≥ 0 and a + b < 0. In the first case the
inequality states that a + b ≤ |a| + |b|; but this follows trivially by
addition of the inequalities a ≤ |a| and b ≤ |b|. In the second case
the triangle inequality reduces to −(a + b) ≤ |a| + |b|, which again
follows by addition from −a ≤ |a|, −b ≤ |b|.

We immediately derive an analogous inequality for three quantities:

$$|a + b + c| \le |a| + |b| + |c|;$$

for, by applying the triangle inequality twice,

$$|a + b + c| = |(a + b) + c| \le |a + b| + |c| \le |a| + |b| + |c|.$$

In the same way, the more general inequality

$$|a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n|$$

is derived.


Occasionally we need estimates for |a + b| from below. We observe
that

$$|a| = |(a + b) + (-b)| \le |a + b| + |-b| = |a + b| + |b|$$

and hence that the inequality

$$|a + b| \ge |a| - |b|$$

holds.
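
Both estimates for |a + b| can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions: the names `triangle_gap` and `lower_gap` are invented for this example, and integer samples are used so that all arithmetic is exact. Such a check is no substitute for the proof above.

```python
import random

def triangle_gap(a, b):
    """|a| + |b| - |a + b|; nonnegative by the triangle inequality."""
    return abs(a) + abs(b) - abs(a + b)

def lower_gap(a, b):
    """|a + b| - (|a| - |b|); nonnegative by the estimate from below."""
    return abs(a + b) - (abs(a) - abs(b))

random.seed(0)  # fixed seed so that the run is repeatable
samples = [(random.randint(-100, 100), random.randint(-100, 100))
           for _ in range(1000)]
all_ok = all(triangle_gap(a, b) >= 0 and lower_gap(a, b) >= 0
             for a, b in samples)

print(all_ok)               # True
print(triangle_gap(3, 5))   # 0: equality when a and b have the same sign
print(triangle_gap(3, -5))  # 6: strict inequality for opposite signs
```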


The Cauchy-Schwarz Inequality 


Some of the most important inequalities exploit the obvious fact 
that the square of a real number is never negative and that conse- 
quently a sum of squares also cannot be negative. One of the most 
frequently used results obtained in this way is the Cauchy-Schwarz 
inequality 


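$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2).$$

Putting

$$\begin{aligned}
A &= a_1^2 + a_2^2 + \cdots + a_n^2, \\
B &= a_1 b_1 + a_2 b_2 + \cdots + a_n b_n, \\
C &= b_1^2 + b_2^2 + \cdots + b_n^2,
\end{aligned}$$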


the inequality becomes $AC \ge B^2$. To prove it we observe that for any
real t

$$0 \le (a_1 + t b_1)^2 + (a_2 + t b_2)^2 + \cdots + (a_n + t b_n)^2$$

since the right-hand side is a sum of squares. Expanding each square
and arranging according to powers of t, we find that

$$0 \le A + 2Bt + Ct^2$$


for all t, where A, B, C have the same meaning as before. Here C ≥ 0.
We may assume that C > 0, since certainly $B^2 = AC = 0$ when
C = 0. Substituting then for t the special value t = −B/C [corresponding
to the minimum of the quadratic expression

$$A + 2Bt + Ct^2 = C\left(t + \frac{B}{C}\right)^2 + \left(A - \frac{B^2}{C}\right)\,],$$

we find

$$0 \le A - \frac{2B^2}{C} + \frac{B^2}{C} = \frac{AC - B^2}{C}$$

and hence $AC - B^2 \ge 0$.
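
The steps of this proof can be followed on a numerical example. The Python sketch below is illustrative only; the function names `abc` and `quadratic_at_minimum` are assumptions of this example. It computes A, B, C for two given lists of numbers and evaluates the sum of squares at the special value t = −B/C, which should equal (AC − B²)/C.

```python
from fractions import Fraction

def abc(a, b):
    """Return (A, B, C) = (sum a_i^2, sum a_i*b_i, sum b_i^2)."""
    A = sum(x * x for x in a)
    B = sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A, B, C

def quadratic_at_minimum(a, b):
    """Return t = -B/C and the sum of the squares (a_i + t*b_i)^2 there."""
    A, B, C = abc(a, b)
    if C == 0:
        raise ValueError("all b_i vanish; the inequality is trivial")
    t = Fraction(-B, C)
    value = sum((x + t * y) ** 2 for x, y in zip(a, b))
    return t, value

a, b = [1, 2, 3], [4, 5, 6]
A, B, C = abc(a, b)
t, value = quadratic_at_minimum(a, b)
print(A, B, C)                               # 14 32 77
print(t, value)                              # -32/77 54/77
print(value == Fraction(A * C - B * B, C))   # True
```

Since the value at the minimum is a sum of squares, it cannot be negative, which is exactly the statement AC − B² ≥ 0 for this example.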


Figure 1.6 Geometric and arithmetic means of x and y. 


In the special case n = 2 we can choose

$$a_1 = \sqrt{x}, \quad a_2 = \sqrt{y}, \quad b_1 = \sqrt{y}, \quad b_2 = \sqrt{x},$$

where x and y are positive numbers. The inequality then takes the
form $(2\sqrt{xy})^2 \le (x + y)^2$ or

$$\sqrt{xy} \le \frac{x + y}{2}.$$

This inequality states that the geometric mean $\sqrt{xy}$ of two positive
numbers x, y never exceeds their arithmetic mean (x + y)/2. The
geometric mean of two numbers x, y can be interpreted as the length
of the altitude of a right triangle dividing the hypotenuse into segments
of length x and y respectively. The inequality then states that
in a right triangle the altitude does not exceed half the hypotenuse (see
Fig. 1.6).¹
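
As a small numerical illustration of the comparison between the two means, the following Python sketch evaluates both for sample values. The function names are assumptions of this example, and floating-point square roots are used, so results for other inputs may carry rounding errors.

```python
import math

def geometric_mean(x, y):
    """Geometric mean of two positive numbers."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    return math.sqrt(x * y)

def arithmetic_mean(x, y):
    """Arithmetic mean of two numbers."""
    return (x + y) / 2

print(geometric_mean(4, 9), arithmetic_mean(4, 9))   # 6.0 6.5
print(geometric_mean(5, 5), arithmetic_mean(5, 5))   # 5.0 5.0
```

In the second call the two means coincide, which corresponds to the case x = y.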


1.2 The Concept of Function 


From the beginning of modern mathematics in the 17th century the 
concept of function has been at the very center of mathematical thought. 
(Leibnitz appears to have been the first to use the word “function.”)
Although the idea of functional relationships is significant far beyond 
the mathematical domain, we shall naturally focus our attention on 
functions in the mathematical sense, that is, on the connection of 
mathematical quantities by mathematical relations or prescriptions or 
“operations.” A very large part of mathematics and the natural sciences 
is dominated by functional relationships, for they occur everywhere in 
analysis, geometry, mechanics, and other fields. For example, the 
pressure in an ideal gas is a function of density and temperature; the 
position of a moving molecule is a function of the time; the volume 
and surface of a cylinder are functions of its radius and height. Whenever
the values of certain quantities a, b, c, ... are determined by those
of certain others x, y, z, ..., we say that a, b, c, ... depend on x, y, z, ...
or are functions of x, y, z, .... Examples of functional relations are
given by formal expressions such as the following.


(a) The formula $A = a^2$ defines A as a function of a. For a > 0 we
can interpret A as the area of a square of side a.

(b) The formula

$$y = \sqrt{1 - x^2}$$

defines y as a function of x for all x for which −1 ≤ x ≤ 1. For
x > 0 this function expresses the side y of a right triangle with hypotenuse
1 in terms of the other side x.

(c) The equations

$$x = t, \qquad y = -t^2$$

assign values of x and y to each t and thus define x and y as functions
of t. If we interpret x and y as the rectangular coordinates of a point P
in the plane and t as the time, then our equations describe the location
of P at the time t; in other words, they describe the motion of the
point P.

(d) The equations

$$a = \frac{x}{x^2 + y^2}, \qquad b = \frac{y}{x^2 + y^2}$$

define a and b as functions of x and y for $x^2 + y^2 \ne 0$. Interpreting the
pairs of values x, y and a, b as rectangular coordinates of two points,
we see that the equations assign to each point (x, y) [with the exception
of the origin (0, 0)] an “image” (a, b). The reader can verify easily that
the image (a, b) always lies on the same ray from the origin as the
“original” or “antecedent” (x, y) and has the reciprocal distance from
the origin. We speak of “mapping” (x, y) onto (a, b) by means of
the equations expressing a, b in terms of x, y.

1 The interested reader will find more material in An Introduction to Inequalities,
by E. F. Beckenbach and R. Bellman, Random House, 1961, and Geometric
Inequalities, by N. Kazarinoff, Random House, 1961.


In the preceding examples the functional law is expressed by simple 
formulas which determine certain quantities in terms of certain others.¹
The quantities appearing on the left-hand sides, the “dependent 
variables,” are expressed in terms of the “independent variables” on 
the right. The mathematical law assigning unique values of the 
dependent variables to given values of the independent variables is 
called a function. It is unaffected by the names x, y, etc., for these
variables. In Example c we have an independent variable ¢ and two 
dependent variables x, y, whereas in Example d there are two independ- 
ent variables x, y and two dependent variables a, b. 

The dependence of y on x by a functional relation is frequently 
indicated by the brief expression “y is a function of x.”²


a. Mapping-Graph 
Domain and Range of a Function 


We usually interpret the independent variables geometrically as
coordinates of a point in one or more dimensions. In Example b this
would be a point on the x-axis, in Example d a point in the x,y-plane.
Sometimes the independent variables are free to take all values, as in
examples a and c. Often, however, there is some restriction, inherent
or imposed, and our functions are not defined for all values. The set
of values or the points for which a function is defined form the
“domain” of the function. In Example a the domain is the whole
a-axis, in b the interval −1 ≤ x ≤ 1, in c the whole t-axis, and in d
the points of the x,y-plane different from the origin.

1 Later we shall gradually realize the need for considering functions not capable of
such representation by simple formulas. (See, for example, p. 25.)

2 This locution is used freely in the sciences, but some of the more pedantic texts
avoid it. There is no point in hampering ourselves by an undue concern for
hairsplitting “precision” when it has no relation to the substance.

To each point P in the domain our functions assign definite values
for the dependent variables. These values also can be interpreted as
coordinates of a point Q, the image of P. We say that P is “mapped”
by our functions onto the point Q. Thus in Example d the point
P = (1, 2) of the x,y-plane is mapped onto the point Q = (1/5, 2/5) of the
a,b-plane. The image points Q form the range of the function.¹ Each
Q in the range is the image of one (or more) points in the domain of
the function.

In Example c points of the t-axis have as their images points in
the x,y-plane. The t-axis is mapped into the x,y-plane. But not every
point of the x,y-plane occurs as image, only those for which $y = -x^2$.
Thus the range of the mapping is the parabola $y = -x^2$. We say,
the t-axis is mapped onto the parabola $y = -x^2$, in the sense that the
image points fill this parabola.

In Example d the range consists of the points (a, b) in the a,b-plane
whose coordinates can be written in the form $a = x/(x^2 + y^2)$,
$b = y/(x^2 + y^2)$ with suitable x, y for which $x^2 + y^2 \ne 0$. In other words,
the range consists of those points (a, b) for which the preceding equations
have a solution (x, y). As seen immediately the range consists of the
points (a, b) for which a and b do not both vanish; each such point
(a, b) is image of the point $x = a/(a^2 + b^2)$, $y = b/(a^2 + b^2)$. Every
geometrical figure in the x,y-plane is then mapped onto a corresponding
figure in the a,b-plane which consists of the images of the points of the
first figure. For example, a circle $x^2 + y^2 = r^2$ about the origin is
mapped onto the circle $a^2 + b^2 = 1/r^2$ in the a,b-plane.
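
The mapping of Example d is easy to experiment with. The Python sketch below is only an illustration; the function name `image` is an assumption of this example, and exact fractions are used to avoid rounding. It computes the image of the point (1, 2), applies the same formulas to the image to recover the original point, and checks the statement about circles for one point with r = 5.

```python
from fractions import Fraction

def image(x, y):
    """Image (a, b) of the point (x, y) under the mapping of Example d."""
    x, y = Fraction(x), Fraction(y)
    r2 = x ** 2 + y ** 2
    if r2 == 0:
        raise ValueError("the origin has no image")
    return x / r2, y / r2

a, b = image(1, 2)
print(a, b)                      # 1/5 2/5
print(image(a, b))               # (Fraction(1, 1), Fraction(2, 1))

# A point on the circle of radius 5 goes to the circle of radius 1/5.
a, b = image(3, 4)
print(a ** 2 + b ** 2)           # 1/25
```

That the same formulas undo the mapping reflects the fact, noted in the text, that each point (a, b) of the range is the image of x = a/(a² + b²), y = b/(a² + b²).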

In this and the following chapters we shall deal almost exclusively 
with a single independent variable, say x, and a single dependent variable,
say y, as indicated in Example b.² Ordinarily we represent such
a function in the standard way by its graph in the x,y-plane, that is, 
by the curve consisting of those points (x, y) whose ordinate is in the 
specified functional relationship to the abscissa x (see Fig. 1.7). For 
Example b the graph is the upper half of a circle of radius one about the 
origin. 

1 It is often convenient to talk of the point Q as “a function” of P, although in the
analytic representation several functions expressing the different coordinates of Q
appear.

2 However, it should be emphasized from the beginning that functions of several
variables occur just as naturally in many instances. They will be discussed
systematically in Volume II.

Figure 1.7 Graph of function.

The interpretation of the function as a mapping of a domain on the
x-axis onto a range on the y-axis leads to a different visualization of
functions. We interpret x and y not as coordinates of the same point
in the x,y-plane, but as points on two different, independent number
axes. Then the function maps a point x on the x-axis into a point y
on the y-axis. Such mappings arise frequently in geometry, such as the
“affine” mapping which originates by projecting a point x on the x-axis
onto a point y on a parallel y-axis from a center 0 located in the plane
of the two axes (see Fig. 1.8). This mapping can be expressed analytically,
as easily ascertained, by the linear function y = ax + b with
constants a and b. Obviously, it is a “one-to-one” mapping in which
inversely to the image y, there corresponds a unique original x. Another,
more general, example is the “perspective mapping” defined by the same
sort of projection, only with the two axes not necessarily parallel.
Here the analytical expression is given by a rational linear function of
the form y = (ax + b)/(cx + d), with constants a, b, c, d.

Figure 1.8 Mappings.

Any projection of a surface S in space into another surface S’ from 
some center N can be viewed as a mapping whose domain is S and 
whose range lies on S’. For example, we can map a sphere onto an 
equatorial plane by projecting each point P of the sphere onto a point 
P’ of the plane by rays from the North Pole (see Fig. 1.9). This mapping
is the “stereographic projection” used frequently for maps of the earth.
The interpretation of functions as “maps” is suggested by examples of
this type.

Figure 1.9 Stereographic projection.

When more independent or dependent variables are involved, the 
definition of functions by mapping provides a more flexible and suitable 
interpretation than that by graphs. This fact will become fully apparent 
in the second volume. 


b. Definition of the Concept of Functions of a Continuous 
Variable. Domain and Range of a Function 


A function of a single independent variable x assigns values y to 
values x. The domain of the function is the totality of values x for 
which the function is defined. In the cases that concern us most the 
domain of the function consists of one or several intervals (see Fig. 
1.10). We say then that y is a function of a continuous variable (in 
contrast to other cases where, for example, the function might only be 
defined for rational or for integral values of x). Here the “intervals”
forming the domain may or may not contain their end points and may
also extend to infinity in one or both directions.¹ Thus the function
$y = \sqrt{1 - x^2}$ is defined in the closed interval −1 ≤ x ≤ +1, the
function y = 1/x in the two semi-infinite open intervals x < 0 and
x > 0, the function $y = x^2$ in the infinite “interval” −∞ < x < +∞
consisting of all x, the function $y = \sqrt{(x^2 - 1)(4 - x^2)}$ in the two
separate intervals 1 ≤ x ≤ 2 and −2 ≤ x ≤ −1.

Figure 1.10 Domain and range of a function in graphical representation.

Functions are denoted by symbols such as f, F, g, etc. The corresponding
relations between x and the associated y-values are written in
the form y = f(x) or y = F(x) or y = g(x), etc., or also sometimes
y = y(x) to indicate² that y depends on x. If, for example, f(x) is
defined by the expression $x^2 + 1$ we have $f(3) = 3^2 + 1 = 10$,
$f(-1) = (-1)^2 + 1 = 2$.
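
The functional notation and the notion of domain carry over directly to a programming language. The Python sketch below is illustrative; the names `f`, `in_domain` and `root_expr` are assumptions of this example. It evaluates f(x) = x² + 1 and tests for which x the expression √((x² − 1)(4 − x²)) considered earlier is defined.

```python
import math

def f(x):
    """The function defined by the expression x^2 + 1."""
    return x ** 2 + 1

def in_domain(x):
    """True if sqrt((x^2 - 1)(4 - x^2)) is defined, i.e. the radicand is >= 0."""
    return (x * x - 1) * (4 - x * x) >= 0

def root_expr(x):
    """Value of sqrt((x^2 - 1)(4 - x^2)) for x in the domain."""
    if not in_domain(x):
        raise ValueError("x lies outside the domain")
    return math.sqrt((x * x - 1) * (4 - x * x))

print(f(3), f(-1))                                   # 10 2
print([x for x in (-3, -2, -1.5, -1, 0, 1, 1.5, 2, 3) if in_domain(x)])
```

The list printed in the last line contains exactly the sample points with 1 ≤ |x| ≤ 2, in agreement with the two separate intervals named in the text.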


1 Ordinarily we will reserve the word “interval” for “bounded,” that is, “finite” 
intervals, that have definite finite end points; then one might indicate the more 
comprehensive concept as used in the text, by the word “convex sets,” meaning
sets which when containing two points must contain all intermediate ones. 

2 In this notation we try to emphasize the variables and do not explicitly indicate 
the functional operation by a symbol such as f. The notation 


f: x → y


for the function f mapping x into y is also sometimes encountered.

In the general definition of a function f(x) nothing is said about
the nature of the relation by which the dependent variable is found
when the independent variable is given. As said before, often the
function is given in “closed form” by a simple expression like
$f(x) = x^2 + 1$ or $f(x) = \sqrt{1 + \sin^2 x}$, and in the early days of the calculus
such explicit expressions were mostly what mathematicians meant by
functions. Often mechanical devices generate geometric curves or
graphs which then define functions. A striking example is the cycloid,
a curve described by a point fixed on a circle which rolls along the
x-axis (see Fig. 1.11). Its functional analytical expression by formulas
will be given later (see p. 328).

Figure 1.11

Logically, we are not restricted to such geometrically or mechanically 
generated functions. Any rule by which a value of y is assigned to 
values of x constitutes a function. In some theoretical investigations 
the wide generality or vagueness of the function concept is, in fact, 
an advantage. However, for applications, particularly in the calculus, 
the general concept of function is unnecessarily wide. To make 
meaningful mathematical developments possible, the “arbitrary” laws 
of correspondence by which a value of y is assigned to x must be 
subjected to radical restrictions. During the past century and a half 
mathematicians have recognized and formulated in precise terms the 
essential restrictions that have to be imposed on the overly general 
concept in order to obtain functions that indeed have the useful 
properties one would expect intuitively.


*Extended or Restricted Domains of Functions


Even for functions given by explicit formulas, it is important to realize
that any complete description of a function must include a definition of the
domain of the function. For us the “function” f described by “$f(x) = x^2$ for
0 < x < 2” is not strictly the same function as the function g given by
“$g(x) = x^2$ in the larger domain −2 < x < 2,” although f(x) and g(x) have
the same values in the interval 0 < x < 2 where both are defined. Generally,
we call a function f a “restriction” of a function g (or g an “extension” of f),
if, wherever f is defined, g is also defined and assumes the same values. Of
course, the same function f can arise by restriction from many different
functions. In our example above f is also a restriction of the function h
defined by $h(x) = x^2$ for 0 < x < 2, $h(x) = -x^2$ for −2 < x < 0. As a
matter of fact this example illustrates the process inverse to that of forming
restrictions of a function which might be called “piecing together”; we can
generate new functions by simply defining them by different explicit expressions
in different portions of the domain.
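
The point that a function consists of a rule together with a domain can be modeled in code. The following Python sketch is only an illustration: the class `Function` and the helper `is_restriction` are assumptions of this example, and the restriction test examines finitely many sample points, so it can refute but never prove the relation.

```python
from fractions import Fraction

class Function:
    """A rule together with an explicit description of its domain."""
    def __init__(self, rule, in_domain):
        self.rule = rule
        self.in_domain = in_domain

    def __call__(self, x):
        if not self.in_domain(x):
            raise ValueError("x is outside the domain")
        return self.rule(x)

def is_restriction(f, g, sample_points):
    """Check on the samples: wherever f is defined, g is defined and agrees."""
    return all(g.in_domain(x) and g(x) == f(x)
               for x in sample_points if f.in_domain(x))

f = Function(lambda x: x * x, lambda x: 0 < x < 2)
g = Function(lambda x: x * x, lambda x: -2 < x < 2)
# h is pieced together from two different expressions.
h = Function(lambda x: x * x if x > 0 else -x * x,
             lambda x: 0 < x < 2 or -2 < x < 0)

points = [Fraction(k, 4) for k in range(-12, 13)]   # -3, -11/4, ..., 3
print(is_restriction(f, g, points))   # True
print(is_restriction(f, h, points))   # True
print(is_restriction(g, f, points))   # False: g is defined where f is not
```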


c. Graphical Representation. Monotonic Functions 


The fundamental idea of analytical geometry is to give an analytical
representation to a curve originally defined by some geometrical
property. This is done usually by regarding one of the rectangular
coordinates, say y, as a function y = f(x) of the other coordinate x;
for example, a parabola is represented by the function $y = x^2$, the
circle with radius 1 about the origin by the two functions $y = \sqrt{1 - x^2}$
and $y = -\sqrt{1 - x^2}$. In the first example we may think of the function
as defined in the infinite interval −∞ < x < ∞; in the second we
must restrict ourselves to the interval −1 ≤ x ≤ 1, since outside this
interval the function has no meaning.¹

Conversely, if instead of starting with a curve defined geometrically 
we consider a function y = f(x) given analytically, we can represent 
the functional dependence of y on x graphically, using a rectangular 
coordinate system in the usual way (cf. Fig. 1.7). If for each abscissa 
x we take the corresponding ordinate y = f(x), we obtain the geo- 
metrical representation of the function. The restrictions to be imposed 
on the function concept should secure for its geometrical representation 
the shape of a “reasonable” geometrical curve. This, it is true, expresses 
an intuitive feeling rather than a strict mathematical condition. However,
we shall soon formulate conditions, such as continuity, differentiability,
etc., which insure that the graph of a function is a curve capable
of being visualized geometrically. This would not be the case if we
admitted “pathological” functions such as the following: For every
rational value of x, the function y has the value 1; for every irrational
value of x, the value of y is 0. This functional prescription assigns a
definite value of y to each x; but in every interval of x, no matter how
small, the value of y jumps from 0 to 1 and back an infinite number of
times. This example demonstrates that the general unrestricted function
concept may lead to graphs which we would not consider as curves.

1 We do not ordinarily consider imaginary or complex values of x and y.


Multivalued Functions 


We consider only functions y = f(x) assigning a unique value of y
to each value of x in the domain, as, for example, $y = x^2$ or y = sin x.
Yet, for a curve described geometrically, it may happen, as for the
circle $x^2 + y^2 = 1$, that the whole course of the curve is not given by
just one (single-valued) function, but requires several functions: in the
case of the circle, the two functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.
The same is true for the hyperbola $y^2 - x^2 = 1$, which is represented
by the two functions $y = \sqrt{1 + x^2}$ and $y = -\sqrt{1 + x^2}$. Such curves
therefore do not determine unambiguously the corresponding functions.
It is sometimes said that the curve is represented by a multivalued
function; the separate functions representing it are then called the
single-valued branches of the multivalued function belonging to the
curve. For the sake of clarity we shall always use the word “function”
to mean a single-valued function. For example, the symbol $\sqrt{x}$ (for
x ≥ 0) will always denote the nonnegative number whose square
is x.

If a curve is the graph of one function, it is intersected by any parallel 
to the y-axis in at most one point, since to each point x in the interval 
of definition there corresponds just one value of y. The unit circle
represented by the two functions

$$y = \sqrt{1 - x^2} \quad \text{and} \quad y = -\sqrt{1 - x^2}$$

is intersected by such parallels to the y-axis in more than one point.
The portions of a curve corresponding to different single-valued 
branches are sometimes connected with each other so that the complete 
curve is a single figure which can be drawn with one stroke of the pen, 
for example, the circle (cf. Fig. 1.12); on the other hand, these portions 
may be completely separated, as for the hyperbola (cf. Fig. 1.13). 


Figure 1.12    Figure 1.13


Examples. Let us consider some further examples of the graphical
representation of functions.

(a) y is proportional to x,

y = ax.

The graph (see Fig. 1.14) is a straight line through the origin of the
coordinate system.

(b) y is a “linear function” of x,

y = ax + b.

The graph is a straight line through the point x = 0, y = b, which,
if a ≠ 0, also passes through the point x = −b/a, y = 0, and if a = 0
is horizontal.

Figure 1.14 Linear functions.

(c) y is inversely proportional to x,

$$y = \frac{a}{x}.$$

In particular, for a = 1

$$y = \frac{1}{x},$$

so that

y = 1 for x = 1, y = 2 for x = 1/2, y = 1/2 for x = 2.

The graph (cf. Fig. 1.15) is a rectangular hyperbola, a curve
symmetrical with respect to the bisectors of the angles between the
coordinate axes.

This function is obviously not defined for the value x = 0 since
division by zero has no meaning. In the neighborhood of the exceptional
point x = 0, the function has arbitrarily large values, both positive
and negative; this is the simplest example of an infinite discontinuity,
a concept which we shall discuss later (see p. 35).


Figure 1.15 Infinite discontinuity.


(d) y is the square of x,

$$y = x^2.$$

As is well known, this function is represented by a parabola (see
Fig. 1.16).

Figure 1.16 Parabola.

Similarly, the function $y = x^3$ is represented by the so-called cubical
parabola (see Fig. 1.17).

Figure 1.17 Cubical parabola.


Monotone Functions


A function which for all values of x in an interval has the same value
y = a is called a constant; it is represented graphically by a horizontal
straight line. A function y = f(x) for which an increase in the value of
x always results in an increase in the value of y (that is, for which
f(x) < f(x′) whenever x < x′) is called a monotonic increasing function;
if, on the other hand, an increase in the value of x always implies a
decrease in the value of y, the function is called a monotonic decreasing
function. Such functions are represented graphically by curves which
always rise or always fall as x traverses the interval of definition toward
increasing values (see Fig. 1.18). A monotone function always maps
different values of x into different y; that is, the mapping is one-to-one.

Figure 1.18 Monotone functions.
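
The defining condition f(x) < f(x′) whenever x < x′ can be tested on a finite set of sample points. The Python sketch below is illustrative only; the function names are assumptions of this example. Such a test can detect a failure of monotonicity but cannot establish it for a whole interval. By transitivity it suffices to compare neighboring sample points.

```python
from fractions import Fraction

def is_increasing(f, xs):
    """True if f(u) < f(v) for all sample points u < v."""
    xs = sorted(set(xs))
    return all(f(u) < f(v) for u, v in zip(xs, xs[1:]))

def is_decreasing(f, xs):
    """True if f(u) > f(v) for all sample points u < v."""
    xs = sorted(set(xs))
    return all(f(u) > f(v) for u, v in zip(xs, xs[1:]))

def is_one_to_one_on(f, xs):
    """True if different sample points have different images."""
    xs = set(xs)
    return len({f(x) for x in xs}) == len(xs)

xs = range(-5, 6)
print(is_increasing(lambda x: x ** 3, xs))                    # True
print(is_increasing(lambda x: x * x, xs))                     # False
print(is_decreasing(lambda x: Fraction(1, x), range(1, 6)))   # True
print(is_one_to_one_on(lambda x: x * x, xs))                  # False
```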


Even and Odd Functions 


If the curve represented by y = f(x) is symmetrical with respect to
the y-axis, that is, if x = −a and x = a yield the same function value

f(−x) = f(x),

we call the function an even function. For example, the function
$y = x^2$ is even (see Fig. 1.16). If, on the other hand, the curve is
symmetrical with respect to the origin; that is, if

f(−x) = −f(x),

we say the function is an odd function; thus the functions y = x,
$y = x^3$ (see Fig. 1.17) and y = 1/x (see Fig. 1.15) are odd.

Figure 1.19 Graph of $y > x^2$.
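
The symmetry conditions for even and odd functions can likewise be examined on sample points. The Python sketch below is an illustration; the names `is_even_on` and `is_odd_on` are assumptions of this example, and the sample points exclude zero because 1/x is not defined there.

```python
from fractions import Fraction

def is_even_on(f, xs):
    """True if f(-x) == f(x) at every sample point."""
    return all(f(-x) == f(x) for x in xs)

def is_odd_on(f, xs):
    """True if f(-x) == -f(x) at every sample point."""
    return all(f(-x) == -f(x) for x in xs)

xs = range(1, 6)
print(is_even_on(lambda x: x * x, xs))           # True
print(is_odd_on(lambda x: x ** 3, xs))           # True
print(is_odd_on(lambda x: Fraction(1, x), xs))   # True
print(is_odd_on(lambda x: x * x, xs))            # False
```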


It is frequently helpful to consider the geometrical representation of
an inequality. For example, the inequality $y > x^2$ is represented by
the domain above the parabola $y = x^2$ (Fig. 1.19). The interior of the
unit circle centered at the origin (Fig. 1.20) is described by the inequality
$x^2 + y^2 < 1$.

Often several inequalities describe more complicated regions with
boundaries consisting of different pieces. Thus the “first” quadrant
of the unit circle is described by the system of simultaneous inequalities:

$$x^2 + y^2 < 1, \quad x > 0, \quad y > 0.$$

(See Fig. 1.21.)
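
A region described by inequalities corresponds to a test that decides whether a given point belongs to it. The Python sketch below is illustrative; the function names are assumptions of this example, and the inequalities are taken exactly as written above.

```python
from fractions import Fraction

def above_parabola(x, y):
    """True if (x, y) satisfies y > x^2."""
    return y > x * x

def in_unit_disk(x, y):
    """True if (x, y) satisfies x^2 + y^2 < 1."""
    return x * x + y * y < 1

def in_first_quadrant_of_disk(x, y):
    """True if (x, y) satisfies all three simultaneous inequalities."""
    return in_unit_disk(x, y) and x > 0 and y > 0

half = Fraction(1, 2)
print(above_parabola(0, 1))                    # True
print(in_first_quadrant_of_disk(half, half))   # True
print(in_first_quadrant_of_disk(-half, half))  # False: x > 0 fails
print(in_first_quadrant_of_disk(1, 0))         # False: the point lies on the boundary
```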


Figure 1.20 Graph of $x^2 + y^2 < 1$.

Figure 1.21 Graph of $x^2 + y^2 < 1$, x > 0, y > 0.


d. Continuity 
Intuitive and Precise Explanation 


The functions and graphs just considered exhibit a property of 
greatest importance in the calculus, that of continuity. Intuitively, 
continuity means that a small change in the independent variable x 
implies only a small change in the dependent variable y = f(x) and 
excludes a jump in the value of y: thus the graph consists of one 
piece. In contrast, a graph $y = f(x)$ consisting of pieces separated by a
gap at an abscissa $x_0$ exhibits there a jump discontinuity. For example,
the function¹ $f(x) = \operatorname{sgn} x$ defined by $f(x) = +1$ for $x > 0$, by
$f(x) = -1$ for $x < 0$, and $f(0) = 0$ has a “jump discontinuity”² at $x_0 = 0$
(see Fig. 1.22).

The idea of continuity is implicit in the everyday use of elementary 
mathematics. Whenever a function y = f(x) is described by tables, 
such as the logarithmic or trigonometric tables, the values of y can be 
listed only for a “discrete” set of values of the independent variable 
x, say at intervals of 1/1000 or 1/100,000. Yet, unlisted values of the 
function may be needed for intermediate x. Then we tacitly assume 
that an unlisted value $f(x_0)$ is approximately the same as that of $f(x)$
for a neighboring x which appears in the table and that $f(x_0)$ can be
approximated as precisely as we want if only the x-values in the table
are spaced sufficiently close to each other.

¹ Pronounced “signum” or “sign” of x.

² Technically, the word “jump” refers only to the particular kind of discontinuity
in which the function approaches values from the right and left that do not both
agree with $f(x_0)$. An “infinite” discontinuity is exhibited by the function $y = 1/x$
for $x \ne 0$ and $y = 0$ for $x = 0$. Still other types of discontinuities will be discussed
later.

Figure 1.22 The function $f(x) = \operatorname{sgn} x$.

Continuity of the function $f(x)$ for a value $x_0$ just means that $f(x)$
differs arbitrarily little from the value $f(x_0)$ once x is sufficiently close
to $x_0$. The words “differs arbitrarily little” and “sufficiently close” are
somewhat vague and must be explained precisely in quantitative terms.

Prescribe any “margin of precision” or “tolerance,” that is, any
positive real number $\varepsilon$ (however small). For continuity of f at $x_0$ we
require that the difference between $f(x)$ and $f(x_0)$ stay within this
margin, that is, that $|f(x) - f(x_0)| < \varepsilon$, for all values x which are
sufficiently close to $x_0$ (or for all values x lying within some distance $\delta$
from $x_0$).

We can visualize most easily what continuity means if we interpret f
as a mapping assigning to points x on the x-axis images on the y-axis.
Take any point $x_0$ on the x-axis and its image $y_0 = f(x_0)$ (see Fig. 1.23).

Figure 1.23 Continuity of the mapping $y = f(x)$ at the point $x_0$.

We mark off an arbitrary open interval J on the y-axis having the point
$y_0$ as center. If $2\varepsilon$ is the length of J, then the points y of J are those
whose distance from $y_0$ is less than $\varepsilon$ or for which $|y - y_0| < \varepsilon$. The
condition for continuity of $f(x)$ at $x_0$ is: All points x close enough to
$x_0$ have images lying in J; or: It is possible to mark off an interval I
on the x-axis with center $x_0$, say the interval $x_0 - \delta < x < x_0 + \delta$, such
that every point x of I has an image $f(x)$ which lies in J and thus
$|f(x) - f(x_0)| < \varepsilon$. Continuity of $f(x)$ at the point $x_0$ means that for
an arbitrary $\varepsilon$-neighborhood J of the point $y_0 = f(x_0)$ on the y-axis a
$\delta$-neighborhood I of the point $x_0$ on the x-axis can be found, all of whose
points are mapped into points of J.¹ Of course, this makes sense only
for points on the x-axis at which the mapping is defined, that is, which
belong to the domain of f. Thus we are led to the following precise
definition of continuity.


The function $f(x)$ is continuous at a point $x_0$ of its domain if for every
positive $\varepsilon$ we can find a positive number $\delta$ such that

$$|f(x) - f(x_0)| < \varepsilon$$

for all values x in the domain of f for which $|x - x_0| < \delta$.


Most useful is the geometric interpretation of continuity when we 
represent the function f by its graph in the zy-plane (see Fig. 1.24). 
Let $P_0 = (x_0, y_0)$ be a point on the graph. The points $(x, y)$ with
$y_0 - \varepsilon < y < y_0 + \varepsilon$ now form a horizontal “strip” J containing $P_0$.
Continuity of f at $x_0$ means that given any such horizontal strip J,
however thin, we can find a vertical strip I given by $x_0 - \delta < x < x_0 + \delta$
so thin that every point of the graph lying in I also falls
into J.

As an illustration we consider the linear function $f(x) = 5x + 3$;
we have

$$|f(x) - f(x_0)| = |(5x + 3) - (5x_0 + 3)| = 5\,|x - x_0|,$$

which expresses that the mapping $y = 5x + 3$ magnifies distances by
the factor 5. Here obviously $|f(x) - f(x_0)| < \varepsilon$ for all x for which
$|x - x_0| < \varepsilon/5$. Consequently, the condition for continuity of $f(x)$
at the point $x_0$ is satisfied if we choose $\delta = \varepsilon/5$ (but, of course, any
positive number $\delta < \varepsilon/5$ is also a possible choice); the image of any
point of the interval $x_0 - \delta < x < x_0 + \delta$ will then lie in the interval
$y_0 - \varepsilon < y < y_0 + \varepsilon$. In this example the statement that the distance
$|y - y_0|$ is “arbitrarily small” for “sufficiently small” $|x - x_0|$ can be
given a quite specific meaning; indeed $|x - x_0|$ is sufficiently small if it
is less than one-fifth of the tolerance $\varepsilon$ prescribed for $|y - y_0|$.

¹ In this definition of continuity I and J are intervals having their centers respectively
at the points $x_0$ and $y_0$. This is convenient for the analytic definition of continuity
at $x_0$, which refers to the distances $|x - x_0|$ and $|y - y_0|$, but it is somewhat
artificial if we interpret f geometrically as a mapping. We could instead define
continuity of $y = f(x)$ at a point $x_0$ just as well by the requirement that for every
open interval J on the y-axis which contains the point $y_0 = f(x_0)$ we can find an open
interval I on the x-axis containing the point $x_0$ such that the y-image of any point x
in I for which the mapping is defined lies in J. The proof of the equivalence of the
two definitions is left to the reader as a simple exercise.

Figure 1.24 Continuity of $y = f(x)$ at the point $x_0$.


Another example is furnished by the function $f(x) = x^2$. Here we
have for $|x - x_0| < \delta$

$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0|\,|2x_0 + (x - x_0)|
\le |x - x_0|\,(2|x_0| + |x - x_0|) < \delta\,(2|x_0| + \delta).$$

We verify immediately that the condition $|f(x) - f(x_0)| < \varepsilon$ is satisfied
if we choose $\delta = -|x_0| + \sqrt{\varepsilon + |x_0|^2}$.
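The two choices of $\delta$ found above can be spot-checked numerically. The following is a minimal sketch, not a proof: the helper names `delta_linear`, `delta_square`, and `worst_deviation` are hypothetical, and sampling finitely many points only gives evidence that $|f(x) - f(x_0)|$ stays below the tolerance.

```python
import math

def delta_linear(eps):
    # f(x) = 5x + 3: |f(x) - f(x0)| = 5|x - x0|, so delta = eps/5 suffices.
    return eps / 5.0

def delta_square(x0, eps):
    # f(x) = x^2: this delta solves delta * (2|x0| + delta) = eps.
    return -abs(x0) + math.sqrt(eps + x0 * x0)

def worst_deviation(f, x0, delta, samples=2001):
    # Largest |f(x) - f(x0)| over sample points strictly inside
    # the open interval (x0 - delta, x0 + delta).
    worst = 0.0
    for k in range(1, samples):
        x = x0 - delta + 2.0 * delta * k / samples
        worst = max(worst, abs(f(x) - f(x0)))
    return worst

eps = 0.01
print(worst_deviation(lambda x: 5 * x + 3, 2.0, delta_linear(eps)) < eps)
print(worst_deviation(lambda x: x * x, 3.0, delta_square(3.0, eps)) < eps)
```

Note that `delta_square` shrinks as $|x_0|$ grows, which already hints that the modulus for $x^2$ depends on the point $x_0$.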


Intuitively, the idea of continuity seems obvious without explanation, 
but the precise formulation may initially be somewhat difficult to 
grasp because of the permissiveness of words such as “one can find” or
“arbitrarily chosen.” Yet the reader who may at first be well satisfied 
with some intuitive notion of continuity will gradually learn to 
appreciate the logical precision and generality of the analytic definition, 
the outcome of a long and persistent struggle for reconciliation of the 
need for intuitive understanding with that of logical clarity. In the long
run a precise meaning for the word “continuity” is indispensable; the 
analytic definition given here is the compelling formulation of an 
important property of functions. 

For the beginner it should be emphasized again that “small” is not 
an absolute designation of a number; rather the term “arbitrarily 
small” refers to a number that is not fixed at the outset but for which 
then any positive value may be chosen, and which is subject to a subsequent
smaller choice for a refined approximation of $f(x_0)$. “Sufficiently
small” refers to a number $\delta$ that must be adjusted to suit a margin of
tolerance set previously by another number $\varepsilon$.


Continuity and Discontinuity Explained by Examples. We can illumi- 
nate the definition of continuity by contrast with examples of dis- 
continuity, examples which do not fit the definition above. Recall the 
simple example of the function $f(x) = \operatorname{sgn} x$ on p. 31. Obviously, for
any $x_0 \ne 0$ this function is continuous according to the $\varepsilon$, $\delta$-definition
above, in fact, with a constant $\delta = |x_0|$ no matter how small $\varepsilon$ is
chosen. But for $x_0 = 0$ no $\delta$ at all can be found if $\varepsilon$ is less than 1 since
$|f(x) - f(0)| = |f(x)| = 1 > \varepsilon$ for every x unequal to zero, however
close x might be to zero.
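The contrast between $x_0 \ne 0$ and $x_0 = 0$ for $\operatorname{sgn} x$ can be illustrated with a small sampled test. This is only a sketch under stated assumptions: `delta_works` is a hypothetical helper that samples finitely many points on both sides of $x_0$, so a `True` result is numerical evidence rather than a proof, whereas a `False` result exhibits an actual violating point.

```python
def sgn(x):
    # +1 for x > 0, -1 for x < 0, 0 for x = 0.
    return (x > 0) - (x < 0)

def delta_works(f, x0, eps, delta, samples=1000):
    # Checks |f(x) - f(x0)| < eps at sample points with 0 < |x - x0| < delta.
    for k in range(1, samples):
        h = delta * k / samples
        if abs(f(x0 + h) - f(x0)) >= eps or abs(f(x0 - h) - f(x0)) >= eps:
            return False
    return True

# At x0 = 2 the constant choice delta = |x0| passes for a tiny tolerance.
print(delta_works(sgn, 2.0, 0.001, 2.0))
# At x0 = 0 every tried delta fails once eps < 1.
print([delta_works(sgn, 0.0, 0.5, 10.0 ** -k) for k in range(1, 6)])
```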

The function $\operatorname{sgn} x$ illustrates the simple type of discontinuity at a
point $\xi$ known as jump-discontinuity, in which $f(x)$ approaches limiting
values from the right and left as x approaches $\xi$; these are limiting values,
however, that differ either from each other or from the value of f at
the point $\xi$.¹ The graph at $x = \xi$ then has a gap. Other curves with
jump discontinuities are sketched in Fig. 1.25a and b; the definition
of these functions should be clear from the figures.

In discontinuities of this kind the limits from the right and the left 
both exist. We turn to discontinuities in which this is not the case. 
The most important of these are the infinite discontinuities or infinities. 


¹ The precise definition of limit will be given in Section 1.7; an intuitive idea is
sufficient for the descriptive remarks made here. 

2 In all these examples of jump discontinuities the limits of the function at the point 
of discontinuity from the right and left have different values. The trivial example 
of the function f(x) defined by 


$$f(x) = 0 \ \text{for } x \ne 0, \qquad f(x) = 1 \ \text{for } x = 0$$


illustrates a jump discontinuity in which the limits from both sides are equal to
each other but differ from the value of f at the point of discontinuity $\xi$ itself. We
have then a removable singularity. Here f can be made continuous by merely
changing the value of f at $\xi$ so as to agree with the limits from both sides.
Figure 1.25 (a), (b).

Figure 1.26 Graph of function with infinite discontinuity.

These are discontinuities like those exhibited by the functions $1/x$ or
$1/x^2$ at the point $x = 0$; as $x \to 0$ the absolute value $|f(x)|$ of the
function increases beyond all bounds. The function $1/x$ increases
numerically beyond all bounds through positive and negative values,
respectively, as x approaches the origin from the right and from the
left. On the other hand, the function $1/x^2$ has for $x = 0$ an infinite
discontinuity at which the value of the function increases beyond any
positive bound as x approaches the origin from both sides (cf. Fig. 1.26
and Fig. 1.27). The function $1/(x^2 - 1)$ shown in Fig. 1.27 has infinite
discontinuities both at $x = 1$ and at $x = -1$.

Figure 1.27 Function with infinite discontinuities.

An example of another type of discontinuity in which no limit from
the right or from the left exists is the “piecewise linear” even function
$y = f(x)$ illustrated in Fig. 1.28, which is defined as follows for all nonzero
values of x. This function alternately takes the values $+1$ and $-1$ for the
x-values of the form $\pm 1/2^n$, where n is any integer: $f(\pm 1/2^n) = (-1)^n$.
In every interval $1/2^{n+1} < x < 1/2^n$ or $-1/2^n < x < -1/2^{n+1}$ the function
$f(x)$ is linear and ranges over all values between $-1$ and $+1$.
Therefore the function swings backward and forward more and more
rapidly between the values $-1$ and $+1$ as x approaches nearer and
nearer to the point $x = 0$, and in the immediate neighborhood of that
point an infinite number of such oscillations occur. A similar behavior
is exhibited by the smooth curve (Fig. 1.29). [Here $f(x)$ actually is given
by an expression in closed form, namely, $f(x) = \sin(1/x)$, with the
sine-function defined appropriately as on p. 51].

Figure 1.29 Oscillating function with discontinuity.
Figure 1.30 Continuous oscillating function.


A contrast to this example is the piecewise linear function $y = f(x)$
that takes the values $f(\pm 1/2^n) = (-\tfrac{1}{2})^n$ for all integers n (see Fig. 1.30)
and is linear for intermediate values of x. Here $f(x)$ remains continuous
at the point $x = 0$ if we assign to it the value 0 at that point. In the
neighborhood of the origin the function oscillates backward and
forward an infinite number of times, but the magnitude of these
oscillations becomes arbitrarily small as the origin is approached. The
situation is the same for the function $y = x \sin(1/x)$ (see Fig. 1.31).
These examples show that continuity permits all sorts of remarkable
possibilities foreign to our naive intuition.
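The different behavior of $\sin(1/x)$ and $x \sin(1/x)$ near the origin can be made tangible by sampling both at the points where $\sin(1/x) = \pm 1$. This is a rough sketch; the helpers `peak_points` and `oscillation` are hypothetical names introduced only for this illustration.

```python
import math

def peak_points(n_start, count):
    # x_k = 1 / ((k + 1/2) * pi): points at which sin(1/x) equals +1 or -1.
    return [1.0 / ((k + 0.5) * math.pi) for k in range(n_start, n_start + count)]

def oscillation(f, points):
    # Difference between the largest and smallest sampled value.
    values = [f(x) for x in points]
    return max(values) - min(values)

near_zero = peak_points(1000, 10)   # all points lie within about 3.2e-4 of 0
print(oscillation(lambda x: math.sin(1.0 / x), near_zero))      # stays close to 2
print(oscillation(lambda x: x * math.sin(1.0 / x), near_zero))  # small
```

Moving `n_start` further out pushes the sample points closer to zero; the first oscillation stays near 2 while the second keeps shrinking, in line with the continuity of $x \sin(1/x)$ at 0 once the value 0 is assigned there.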

*Removable Discontinuities 


As noted it may happen that at a certain point, say $x = 0$, a function
is not defined by the original law, as, for example, in the last examples
discussed. We are then free to extend the definition of the function by
assigning to it any desired value at such a point. In the last example
we can choose the definition in such a way that the function becomes
continuous at that point also, namely, by choosing $y = 0$ at $x = 0$. A
similar continuous extension can be defined whenever the limits from
the left and from the right both exist and are equal to one another;
then we need only make the value of the function at the point in question
equal to these limits in order to make the function continuous there.
Whatever discontinuity may be imposed by definition at $x = 0$, this
discontinuity is “removable” by assigning a suitable value $f(0)$. For
the function $y = \sin(1/x)$ or for the function in Fig. 1.28, this is, however,
not possible: whatever value we assign to the function at $x = 0$, the
extended function is discontinuous.

Figure 1.31 Continuous oscillating function.
Modulus of Continuity. Uniform Continuity. Our definition of continuity
of the function $f(x)$ at $x_0$ requires that for every degree of
precision $\varepsilon > 0$ there exist quantities $\delta > 0$ (so-called moduli of continuity)
such that $|f(x) - f(x_0)| < \varepsilon$ for all x in the domain of f for
which $|x - x_0| < \delta$. A modulus of continuity expresses information
about the sensitivity of f to changes in x. A modulus of continuity $\delta$
is never unique; it can always be replaced (for the same $x_0$ and $\varepsilon$) by
any smaller positive quantity $\delta'$ since $|x - x_0| < \delta'$ implies $|x - x_0| < \delta$
and thereby $|f(x) - f(x_0)| < \varepsilon$. For practical purposes, as in numerical
computations, we may be interested in a particular choice of $\delta$;
for example, in the largest value for $\delta$. On the other hand, if we
merely want to establish the fact that f is continuous at $x_0$, then we need
only to exhibit any one modulus of continuity for every positive $\varepsilon$.

In general, as our examples show, this $\delta = \delta(\varepsilon)$ depends not only on $\varepsilon$
but also on the value of $x_0$. Of course, we need not consider all positive
values $\varepsilon$. We can always restrict considerations to sufficiently small $\varepsilon$,
say to $\varepsilon \le \varepsilon_0$ for an arbitrarily chosen $\varepsilon_0$, since for $\varepsilon > \varepsilon_0$ we can use
the same modulus of continuity as for $\varepsilon = \varepsilon_0$. Similarly, we only have
to take into account the points x of the domain of f lying in an arbitrary
neighborhood of $x_0$, say those with $|x - x_0| < \delta_0$, since we can always
replace any modulus of continuity $\delta$ by a smaller one which does not
exceed $\delta_0$. Continuity of f at $x_0$ is a local property, meaning a property
which only depends on the values of f in some neighborhood of $x_0$,
however small.

As we have seen, the function f may be continuous for some x and 
discontinuous for others. A function is called continuous in an interval 
if it is continuous at each point of the interval. For each $x_0$ of the
interval we have then a modulus of continuity $\delta = \delta(\varepsilon)$ which can be
expected to vary with $x_0$, reflecting the different rates at which y changes
with changing x near different points $x_0$.

We call f uniformly continuous in an interval if we can find a uniform
modulus of continuity $\delta = \delta(\varepsilon)$ for that interval, that is, one not dependent
on the particular point $x_0$ of the interval. Thus $f(x)$ is uniformly
continuous in an interval¹ if for each positive $\varepsilon$ there exists a positive
number $\delta$ such that $|f(x) - f(x_0)| < \varepsilon$ for any two points x and $x_0$ of
the interval for which $|x - x_0| < \delta$.

For a uniformly continuous function $y = f(x)$ the values of y differ
“arbitrarily little” from each other for any values of x that are “sufficiently
close” regardless of their location in the interval. In some respects
uniform continuity comes closer to intuitive notions than the mere local
property of continuity.

¹ In this definition the word interval can refer either to closed or open or infinite
intervals.

For example, the function $f(x) = 5x + 3$ is uniformly continuous
for all values of the independent variable since here $|f(x) - f(x_0)| =
5\,|x - x_0| < \varepsilon$ for $|x - x_0| < \varepsilon/5$, and thus $\delta(\varepsilon) = \varepsilon/5$ represents a
uniform modulus of continuity.

The function $f(x) = x^2$ for an infinite x-interval is definitely not
uniformly continuous. It is clear that small changes in x can produce
arbitrarily large changes in $x^2$ if only x is large enough. A glance at a
table of squares of integers x shows how successive squares are spaced
further and further apart as x increases. If, however, we only consider
pairs of values x and $x_0$ belonging to a fixed finite closed interval $[a, b]$,
we can find a uniform modulus of continuity. Indeed, for $|x - x_0| < \delta$
we have

$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0|\,|x + x_0| \le 2\,|x - x_0|\,(|b| + |a|)
< 2\delta\,(|b| + |a|) = \varepsilon$$

if we take $\delta = \varepsilon / \bigl(2(|b| + |a|)\bigr)$.
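The uniform modulus $\delta = \varepsilon / (2(|a| + |b|))$ for $x^2$ on $[a, b]$ can be tried out numerically, and the same $\delta$ can be seen to fail far outside the interval. The sketch below uses hypothetical helper names (`uniform_delta_square`, `max_gap`) and a finite grid, so it illustrates the estimate rather than proving it.

```python
def uniform_delta_square(a, b, eps):
    # Uniform modulus for f(x) = x^2 on the closed interval [a, b].
    return eps / (2.0 * (abs(a) + abs(b)))

def max_gap(f, a, b, delta, steps=1000):
    # Largest |f(x) - f(x0)| over grid points x0 in [a, b], where x lies
    # just inside distance delta of x0 and is clipped to stay in [a, b].
    worst = 0.0
    for k in range(steps + 1):
        x0 = a + (b - a) * k / steps
        x = min(b, x0 + 0.999 * delta)
        worst = max(worst, abs(f(x) - f(x0)))
    return worst

square = lambda x: x * x
eps = 0.01
d = uniform_delta_square(-2.0, 5.0, eps)
print(max_gap(square, -2.0, 5.0, d) < eps)               # one delta serves all of [-2, 5]
print(abs(square(1e6 + 0.999 * d) - square(1e6)) > eps)  # the same delta fails near x0 = 10^6
```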

A similar situation prevails for the function $f(x) = 1/x$ for $x \ne 0$,
$f(0) = 0$. Consider a closed bounded interval $a \le x \le b$ throughout
which the function is continuous. Such an interval cannot include the
origin, which is a point of discontinuity, so that a and b must have the
same sign. Suppose a and b are both positive. Then for x and $x_0$
belonging to the interval and for $|x - x_0| < \delta$ we have

$$\left|\frac{1}{x} - \frac{1}{x_0}\right| = \frac{|x_0 - x|}{|x_0|\,|x|} < \frac{\delta}{a^2} = \varepsilon$$

for $\delta = a^2 \varepsilon$. Thus the function is uniformly continuous in the interval
$[a, b]$. Of course, this proves also that the function $f(x) = 1/x$ is continuous
at every point $x_0 > 0$. For every such value $x_0$ can be enclosed
in some interval $a \le x_0 \le b$ with positive a, b. The expression $\delta = a^2 \varepsilon$
is then a modulus of continuity for the function at $x_0$ if we restrict x
to a neighborhood of $x_0$ lying completely in the interval.

The continuous functions of the preceding examples turn out to be 
uniformly continuous in any closed bounded interval belonging to 
their domain. They illustrate a general fact which will be proved in 
the Supplement, p. 100. 


Any function, continuous in a closed and bounded interval, automatically 
is uniformly continuous in that interval. 
The restriction to bounded intervals is essential as the example of the
function $x^2$ shows. Similarly, we must stipulate that the interval be
closed; for example, the function $y = 1/x$ is continuous in the open
interval $0 < x < 1$ but is not uniformly continuous there; arbitrarily
large changes in y can be produced by arbitrarily small changes in x
if only x is sufficiently close to the origin. If there existed a uniform
modulus of continuity $\delta(\varepsilon)$ for the interval $(0, 1)$, we could take, for
example, $x_0 < \delta$, $x = \tfrac{1}{2} x_0$; obviously then $|1/x - 1/x_0| = 1/x_0$ is greater
than any preassigned $\varepsilon$ whenever $x_0$ is sufficiently small, so that the
assumption of a uniform $\delta(\varepsilon)$ leads to a contradiction.
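The contradiction argument for $1/x$ on $(0, 1)$ is constructive: for any proposed $\delta$ and tolerance $\varepsilon$ it names a violating pair of points. A minimal sketch follows, with `counterexample` as a hypothetical helper; it picks $x_0$ smaller than $\delta$, smaller than 1, and small enough that $1/x_0 > \varepsilon$, then sets $x = x_0/2$.

```python
def counterexample(delta, eps):
    # Returns (x0, x) in (0, 1) with |x - x0| < delta but |1/x - 1/x0| > eps.
    x0 = min(delta, 1.0, 1.0 / eps) / 2.0
    x = x0 / 2.0
    return x0, x

for delta in (1.0, 0.1, 1e-3, 1e-6):
    x0, x = counterexample(delta, eps=1.0)
    # |x - x0| is below delta, yet the function values differ by 1/x0.
    print(delta, abs(x - x0) < delta, abs(1.0 / x - 1.0 / x0))
```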


Lipschitz-Continuity. Hölder-Continuity. In the preceding examples
of functions uniformly continuous in an interval $[a, b]$ we found a
particularly simple modulus of continuity, namely $\delta(\varepsilon)$ proportional
to $\varepsilon$. This most common situation is presented by the so-called
Lipschitz-continuous functions, that is, by the functions $f(x)$ which
satisfy an inequality of the form

$$|f(x_2) - f(x_1)| \le L\,|x_2 - x_1|$$

(a so-called Lipschitz condition) for all $x_1$, $x_2$ in the interval with a fixed
value L. Lipschitz-continuity means that the “difference quotient”

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

formed for any two distinct points of the interval never exceeds a fixed
finite value L in absolute value or that the mapping $y = f(x)$ magnifies
distances of points on the x-axis at most by the factor L. Clearly, for a
Lipschitz-continuous function the expression $\delta(\varepsilon) = \varepsilon/L$ is a modulus of
continuity since $|f(x_2) - f(x_1)| < \varepsilon$ for $|x_2 - x_1| < \varepsilon/L$. Conversely,
any function with a modulus of continuity proportional to $\varepsilon$, say
$\delta(\varepsilon) = c\varepsilon$, is Lipschitz-continuous, with $L = 1/c$.
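A Lipschitz constant can be estimated from below by taking the largest difference quotient over a grid of point pairs. The following sketch is an illustration only: `max_difference_quotient` is a hypothetical helper, and a finite grid can underestimate the true supremum. For $f(x) = x^2$ on $[0, 3]$ the quotient equals $x_1 + x_2$, so $L = 6$ is an admissible constant.

```python
def max_difference_quotient(f, a, b, n=200):
    # Largest |f(x2) - f(x1)| / |x2 - x1| over all pairs of distinct grid points.
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            q = abs(f(xs[j]) - f(xs[i])) / (xs[j] - xs[i])
            best = max(best, q)
    return best

L_estimate = max_difference_quotient(lambda x: x * x, 0.0, 3.0)
print(L_estimate)          # slightly below 6, since x1 + x2 < 6 for distinct grid points
eps = 0.01
print(eps / 6.0)           # delta(eps) = eps / L with the admissible constant L = 6
```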


As we shall see in Chapter 2 most of the functions encountered are 
Lipschitz-continuous except at isolated points, as a consequence of the 
fact that their derivatives are bounded in any closed interval which excludes 
these points. However, Lipschitz-continuity is only sufficient but not nec- 
essary for uniform continuity. The simplest example of a function which is 


continuous without being Lipschitz-continuous is given by $f(x) = \sqrt{x}$ for
$x \ge 0$ and $x_1 = 0$. Here the difference quotient

$$\frac{f(x) - f(0)}{x - 0} = \frac{1}{\sqrt{x}}$$

becomes arbitrarily large for sufficiently small x and hence cannot be bounded
by a fixed constant L. Thus it is not possible to choose $\delta(\varepsilon)$ proportional to
$\varepsilon$; but there exist other, nonlinear, moduli of continuity for this function, for
example, $\delta(\varepsilon) = \varepsilon^2$.

The function $\sqrt{x}$ belongs to the general class of functions called “Hölder-continuous,”
satisfying a “Hölder-condition”

$$|f(x_2) - f(x_1)| \le L\,|x_2 - x_1|^{\alpha}$$

for all $x_1$, $x_2$ of an interval, where L and $\alpha$ are fixed constants, the “Hölder-exponent”
$\alpha$ being restricted to values $0 < \alpha \le 1$. The Lipschitz-continuous
functions arise for the special Hölder-exponent $\alpha = 1$.

Obviously, $\delta = L^{-1/\alpha} \varepsilon^{1/\alpha}$ is a possible modulus of continuity for a Hölder-continuous
function f; here $\delta$ is proportional to $\varepsilon^{1/\alpha}$ and not to $\varepsilon$ itself.
The function $f(x) = \sqrt{x}$ is Hölder-continuous with exponent $\alpha = \tfrac{1}{2}$. This
follows from the inequality

$$|\sqrt{x_2} - \sqrt{x_1}|^2 \le |x_2 - x_1|,$$

which we obtain by observing

$$|\sqrt{x_2} - \sqrt{x_1}| \le |\sqrt{x_2} + \sqrt{x_1}|$$

and multiplying by $|\sqrt{x_2} - \sqrt{x_1}|$. This yields the modulus of continuity
$\delta(\varepsilon) = \varepsilon^2$ for $\sqrt{x}$ as mentioned before.

More generally, the fractional powers $f(x) = x^{\alpha}$ for $0 < \alpha < 1$ are Hölder-continuous
with Hölder-exponent $\alpha$.

The Hölder-continuous functions still do not exhaust the class of all
uniformly continuous functions. It is not difficult to construct examples
of continuous functions for which powers of $\varepsilon$ do not suffice as moduli of
continuity. (See Problem 13, p. 118.)
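The behavior of $\sqrt{x}$ described above can be checked on a few numbers: the Hölder ratio with exponent $\tfrac{1}{2}$ stays bounded by 1, while the ordinary difference quotient at the origin blows up. This is a small sketch with hypothetical helper names (`holder_ratio`, `difference_quotient_at_zero`); the tolerance in the comparison only absorbs floating-point rounding.

```python
import math

def holder_ratio(x1, x2, alpha=0.5):
    # |sqrt(x2) - sqrt(x1)| / |x2 - x1|**alpha; the Hölder condition with
    # L = 1 and alpha = 1/2 says this never exceeds 1.
    return abs(math.sqrt(x2) - math.sqrt(x1)) / abs(x2 - x1) ** alpha

def difference_quotient_at_zero(x):
    # (sqrt(x) - sqrt(0)) / (x - 0) = 1 / sqrt(x), unbounded as x -> 0.
    return math.sqrt(x) / x

grid = [k / 50.0 for k in range(51)]
ratios = [holder_ratio(grid[i], grid[j])
          for i in range(len(grid)) for j in range(i + 1, len(grid))]
print(max(ratios) <= 1.0 + 1e-12)
print([difference_quotient_at_zero(10.0 ** -k) for k in (2, 4, 6, 8)])
```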


e. The Intermediate Value Theorem. Inverse Functions 


Intuitively there is no doubt that a function which is continuous, and 
hence has no “jumps,” cannot vary from one value to another without 
passing through all intermediate values. This fact is expressed by the 
so-called intermediate value theorem (its precise proof is given in the 
Supplement, p. 100). 


INTERMEDIATE VALUE THEOREM. Consider a function $f(x)$ continuous
at every point of an interval. Let a and b be any two points of the
interval and let $\eta$ be any number between $f(a)$ and $f(b)$. Then there exists
a value $\xi$ between a and b for which $f(\xi) = \eta$.


Interpreted geometrically, the theorem states that if two points
$(a, f(a))$ and $(b, f(b))$ of the graph of a continuous function f lie on
different sides of a parallel $y = \eta$ to the x-axis, then the parallel
intersects the graph at some intermediate point (see Fig. 1.32). There
may, of course, be several intersections. In the important case where the
function $f(x)$ is monotonic increasing or monotonic decreasing throughout
the interval, there can be only one intersection for then f cannot
have the same value $\eta$ for two different values of $\xi$.

Figure 1.32 The intermediate value theorem.

As an example we take the function $f(x) = x^2$ which is monotonic
increasing and continuous in the interval $1 \le x \le 2$. Here $f(1) = 1$,
$f(2) = 4$. Taking for $\eta$ the value 2 intermediate between 1 and 4 we find
that there exists a unique $\xi$ between 1 and 2 for which $\xi^2 = 2$. This is,
of course, the number denoted by $\sqrt{2}$.
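The theorem only asserts that $\xi$ exists, but repeated halving of the interval gives one practical way to approximate it. The sketch below uses bisection, which is an illustration chosen here and not the proof given in the Supplement; the function name `bisect_value` and the tolerance are assumptions.

```python
def bisect_value(f, a, b, eta, tol=1e-12):
    # Assumes f is continuous on [a, b] and eta lies between f(a) and f(b).
    # Keeps an interval [lo, hi] whose end values straddle eta and halves it.
    lo, hi = a, b
    increasing = f(a) <= f(b)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) < eta) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xi = bisect_value(lambda x: x * x, 1.0, 2.0, 2.0)
print(xi, xi * xi)   # xi approximates the square root of 2
```

For a monotonic function the value found is the unique $\xi$; for a general continuous function the procedure still converges to some point where the value $\eta$ is taken, though other such points may exist.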


Continuity of the Inverse Function 


For any monotonic increasing continuous function $f(x)$ defined in an
interval $a \le x \le b$, we found that for every $\eta$ with $f(a) \le \eta \le f(b)$
there is exactly one $\xi$ with $a \le \xi \le b$ for which $f(\xi) = \eta$.¹ Let
$\alpha = f(a)$, $\beta = f(b)$. Since $\xi$ is determined uniquely by $\eta$, it represents
a function $\xi = g(\eta)$ defined for arguments $\eta$ in the closed interval
$[\alpha, \beta]$. We call this function g the inverse of f. Since larger $\xi$ correspond
to larger $\eta = f(\xi)$, the function g is again monotonic increasing. It is
easy to show that the inverse function g is also continuous.


¹ The intermediate value theorem as stated assigns $\xi$ for $\eta$ in the open interval
$f(a) < \eta < f(b)$. However, of course, for $\eta = f(a)$ or $\eta = f(b)$ we have only to
take $\xi = a$ or $\xi = b$.
Figure 1.33 Continuity of the inverse of a monotonic continuous function.

Figure 1.34 Inverse functions.

Indeed, let $\eta$ be any value between $\alpha$ and $\beta$ (see Fig. 1.33). Then $\xi = g(\eta)$
must lie between $a = g(\alpha)$ and $b = g(\beta)$. Let $\varepsilon$ be a given positive number
which we can assume to be so small that $a < \xi - \varepsilon < \xi + \varepsilon < b$. We must
show that $|g(y) - g(\eta)| < \varepsilon$ for all y sufficiently close to $\eta$. Since f is
increasing, $\eta = f(\xi)$ lies between the values $f(\xi - \varepsilon) = A$ and $f(\xi + \varepsilon) = B$
and we can find a $\delta$ so small that

$$A < \eta - \delta < \eta + \delta < B.$$

If y is any value with $\eta - \delta < y < \eta + \delta$ and $x = g(y)$, we have
$A < y < B$ and hence $g(A) < g(y) < g(B)$, that is, $\xi - \varepsilon < g(y) < \xi + \varepsilon$ or
$|g(y) - g(\eta)| < \varepsilon$. The same proof, modified slightly, applies when $\eta$ is one
of the end points $\alpha$ or $\beta$ of the interval of definition of g.


The relations $y = f(x)$ and $x = g(y)$ are equivalent and are represented
by the same graph in the x,y-plane; the points $(x, y)$ in the plane
for which $y = f(x)$ are the same as those points for which $x = g(y)$.
If we represent the function g in the customary way by $y = g(x)$, we
must interchange x and y; then the graph of $y = g(x)$ is obtained from
the graph of $y = f(x)$ by taking the mirror image with respect to the
line $y = x$. An example is given by the graphs of the function $f(x) = x^2$
for $x \ge 0$ and of the inverse function $g(x) = \sqrt{x}$ for $x \ge 0$ (see Fig.
1.34).
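The mirror-image relation between the two graphs amounts to swapping the coordinates of every graph point. A brief sketch, with `graph` and `reflect` as hypothetical helper names: points of $y = x^2$ for $x \ge 0$ are reflected in the line $y = x$ and then compared with $y = \sqrt{x}$.

```python
import math

def graph(f, xs):
    # Sampled points (x, f(x)) of the graph of f.
    return [(x, f(x)) for x in xs]

def reflect(points):
    # Mirror image with respect to the line y = x: interchange x and y.
    return [(y, x) for (x, y) in points]

xs = [k / 10.0 for k in range(31)]            # 0.0, 0.1, ..., 3.0
mirrored = reflect(graph(lambda x: x * x, xs))
# Every mirrored point (u, v) should satisfy v = sqrt(u) up to rounding.
print(all(abs(math.sqrt(u) - v) < 1e-12 for u, v in mirrored))
```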


1.3 The Elementary Functions 


a. Rational Functions 


We turn to a brief review of the familiar elementary functions. The 
simplest types of function are constructed by repeated application of 
the elementary operations, addition and multiplication. If we apply 
these operations to an independent variable x and to a set of real
numbers $a_0, a_1, \ldots, a_n$, we obtain the polynomials

$$y = a_0 + a_1 x + \cdots + a_n x^n.$$

Polynomials are the simplest functions of analysis and in a sense the
basic ones.
Quotients of such polynomials, of the form

$$y = \frac{a_0 + a_1 x + \cdots + a_n x^n}{b_0 + b_1 x + \cdots + b_m x^m},$$


are the general rational functions; these are defined at all points where 
the denominator differs from zero. 
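Since polynomials arise from repeated addition and multiplication alone, they can be evaluated with exactly those operations. The sketch below uses Horner's scheme, a standard evaluation method not mentioned in the text; the names `horner` and `rational` are assumptions made for this illustration.

```python
def horner(coeffs, x):
    # coeffs = [a_0, a_1, ..., a_n]; evaluates a_0 + a_1 x + ... + a_n x^n
    # as (...(a_n x + a_{n-1}) x + ...) x + a_0, using n multiplications.
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def rational(num, den, x):
    # Quotient of two polynomials; undefined where the denominator vanishes.
    d = horner(den, x)
    if d == 0:
        raise ZeroDivisionError("denominator polynomial is zero at this x")
    return horner(num, x) / d

print(horner([1.0, 2.0, 3.0], 2.0))            # 1 + 2*2 + 3*4
print(rational([1.0], [-1.0, 0.0, 1.0], 2.0))  # 1 / (x^2 - 1) at x = 2
```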
The simplest polynomial, the linear function

$$y = ax + b,$$

is represented graphically by a straight line. Every quadratic function

$$y = ax^2 + bx + c$$

is represented by a parabola. The graphs of polynomials of the third
degree

$$y = ax^3 + bx^2 + cx + d$$

are occasionally called parabolas of the third order, etc.


Figure 1.35 Powers of x.


The graphs of the function $y = x^n$ for the exponents n = 1, 2, 3, 4
are given in Fig. 1.35. For even values of n the function $y = x^n$ satisfies
the equation $f(-x) = f(x)$, and is therefore an even function, whereas
for odd values of n the function satisfies the condition $f(-x) = -f(x)$,
and is therefore odd.

The simplest example of a rational function which is not a polynomial
is the function $y = 1/x$ mentioned on p. 27; its graph is a rectangular
hyperbola. Another example is the function $y = 1/x^2$ (cf. Fig. 1.26,
p. 36).
b. Algebraic Functions


We are at once forced out of the set of rational functions by the
problem of forming their inverses. The typical example of this is the
function $\sqrt[n]{x}$, the inverse of $x^n$. The function $y = x^n$ for $x \ge 0$ is
easily seen to be monotonic increasing and continuous. It therefore
has a single-valued inverse, which we denote by the symbol $x = \sqrt[n]{y}$,
or, interchanging the letters used for the dependent and independent
variables,

$$y = \sqrt[n]{x} = x^{1/n}.$$

By definition this root is always nonnegative. For odd values of n the
function $x^n$ is monotonic for all values of x, including negative values.
Consequently, for odd values of n we may extend the definition of $\sqrt[n]{x}$
uniquely to all values of x; in this case $\sqrt[n]{x}$ is negative for negative
values of x.
More generally, we may consider

$$y = \sqrt[n]{R(x)},$$

where $R(x)$ is a rational function. Further functions of similar type
are formed by applying rational operations to one or more of these
special functions. Thus, for example, we may form the functions

$$y = \sqrt[m]{x} + \sqrt[n]{x^2 + 1}, \qquad y = x + \sqrt{x^2 + 1}.$$

These functions are special cases of algebraic functions. (The general
concept of an algebraic function will be defined in Volume II.)


c. Trigonometric Functions 


The rational functions and the algebraic functions are defined directly
by the elementary operations of calculation, but geometry is the source
from which we first draw examples of other functions, the so-called
transcendental functions.¹ Of these we consider here the elementary
transcendental functions, namely, the trigonometric functions, the
exponential function, and the logarithm.

In analytical investigations angles are not measured in degrees,
minutes, and seconds, but in radians. We place the angle to be measured
with its vertex at the center of a circle of radius 1, and measure the size
of the angle by the length of the arc of the circumference cut out by the
angle.² Thus an angle of 180° is the same as an angle of $\pi$ radians
(has radian measure $\pi$), an angle of 90° has radian measure $\pi/2$, an
angle of 45° has radian measure $\pi/4$, an angle of 360° has radian measure
$2\pi$. Conversely, an angle of 1 radian expressed in degrees is

$$\frac{180^\circ}{\pi},$$

or approximately 57° 17′ 45″.

Henceforth, whenever we speak of an angle x, we shall mean an
angle whose radian measure is x.

Figure 1.36 The trigonometric functions.

Figure 1.37


¹ The word “transcendental” does not mean anything particularly deep or mysterious;
it merely suggests that the definition of these functions transcends the
elementary operations of calculations, “quod algebrae vires transcendit.”

² The radian measure of an angle can also be defined as twice the area of the corresponding
sector of the circle of radius one.
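The conversion of one radian into degrees, minutes, and seconds quoted above can be reproduced in a few lines. This is a small sketch; `radian_to_dms` is a hypothetical helper and assumes a nonnegative angle.

```python
import math

def radian_to_dms(radians):
    # Converts a nonnegative angle in radians to (degrees, minutes, seconds),
    # using 1 radian = 180/pi degrees and 60 minutes (seconds) per degree (minute).
    degrees_total = radians * 180.0 / math.pi
    degrees = int(degrees_total)
    minutes_total = (degrees_total - degrees) * 60.0
    minutes = int(minutes_total)
    seconds = (minutes_total - minutes) * 60.0
    return degrees, minutes, seconds

d, m, s = radian_to_dms(1.0)
print(d, m, round(s))   # 57 17 45
```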
We briefly recall the meaning of the trigonometric functions sin x,
cos x, tan x, cot x.¹ They are shown in Fig. 1.36, in which the angle x
is measured from the segment OC (of length 1), angles being reckoned 
positive in the counterclockwise direction. The functions cos x and 
sin x are the rectangular coordinates of the point A. The graphs of 
the functions sin x, cos x, tan x, cot x are given in Figs. 1.37 and 1.38. 


Later (see p. 215) we will be able to replace the geometrical definitions 
by analytical ones. 
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d. The Exponential Function and the Logarithm 


In addition to the trigonometric functions, the exponential function 
with the positive base a, 


y = @, 


and its inverse, the logarithm to the base a, 
x = log, Y, 


1 It is also sometimes convenient to introduce the functions secs = 1/cosz, 
cosec x = 1/sin z. 
are also included among the elementary transcendental functions. In
elementary mathematics it is customary to pass over certain inherent
difficulties in their definition, and we too shall postpone the detailed
discussion of them until we have better methods at our disposal
(cf. Section 2.5, p. 145). We can, however, at least indicate here one
“elementary” way of defining these functions. If x = p/q is a rational
number (where p and q are positive integers), then, the number a being
assumed positive, we define $a^x$ as $\sqrt[q]{a^p} = a^{p/q}$, where the root,
according to convention, is to be taken as positive. Since the rational values
of x are everywhere dense, it is natural to extend this function $a^x$ to a
continuous function defined for irrational values of x as well, giving
to $a^x$, when x is irrational, values which are continuous with the values
already defined when x is rational. This defines a continuous function
$y = a^x$, the “exponential function,” which for all rational values of x
gives the value of $a^x$ found above. That this extension is actually
possible and can be carried out in only one way we take for granted at
the moment; but it must be borne in mind that we still have to prove
that this is so.¹

The function 


$$x = \log_a y$$


can then be defined for y > 0 as the inverse of the exponential function:
$x = \log_a y$ is that number for which $y = a^x$.
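The "elementary" definition above can be tried out numerically. The following is only a sketch under stated assumptions: the helper names (`power_rational`, `decimal_truncations`, `log_base`) are made up for this illustration, ordinary floating-point arithmetic stands in for exact real numbers, and `log_base` assumes a > 1. It evaluates $2^x$ along rational x approaching $\sqrt{2}$ and recovers the logarithm as the inverse by bisection.

```python
import math
from fractions import Fraction

def power_rational(a: float, x: Fraction) -> float:
    """a**(p/q) for a > 0: take the positive q-th root first, then the p-th power."""
    p, q = x.numerator, x.denominator
    return (a ** (1.0 / q)) ** p

def decimal_truncations(x: float, digits: int) -> list:
    """Rational numbers k/10**d (d = 1..digits) approaching x from below."""
    return [Fraction(math.floor(x * 10 ** d), 10 ** d) for d in range(1, digits + 1)]

def log_base(a: float, y: float, tol: float = 1e-12) -> float:
    """The x with a**x == y, for a > 1 and y > 0, located by bisection."""
    lo, hi = -1.0, 1.0
    while a ** lo > y:   # widen the bracket to the left
        lo *= 2
    while a ** hi < y:   # widen the bracket to the right
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if a ** mid < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 2**x along rational exponents approaching sqrt(2)
exponents = decimal_truncations(math.sqrt(2), 6)
values = [power_rational(2.0, r) for r in exponents]
for r, v in zip(exponents, values):
    print(r, v)
```

The printed values settle down as the rational exponents approach $\sqrt{2}$, which is the behavior the continuous extension relies on; this is an observation, not a proof that the extension exists or is unique.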


e. Compound Functions, Symbolic Products, Inverse Functions 


New functions are frequently formed not only by combining known 
functions by rational operations but by the more general and basic 
process of forming functions of functions or compound functions. 

Let $u = \phi(x)$ be a function whose domain is in the interval $a \le x \le b$
and whose range lies in the interval $\alpha \le u \le \beta$. Moreover, let $y = g(u)$
be a function defined for $\alpha \le u \le \beta$. Then $g(\phi(x)) = f(x)$ defines a
function f for $a \le x \le b$ which is “compounded” or “composed” from
g and $\phi$. For example, $f(x) = 1/(1 + x^{2n})$ is composed of the functions
$\phi(x) = 1 + x^{2n}$ and $g(u) = 1/u$. Similarly, the function $f(x) = \sin(1/x)$
is composed of $\phi(x) = 1/x$ and $g(u) = \sin u$.

It is useful to interpret the compound functions in terms of mappings.
The mapping $\phi$ takes every point x of the interval [a, b] into a point u
in the interval $[\alpha, \beta]$; the mapping g takes any value u in $[\alpha, \beta]$ into a
point y. The mapping f is the “symbolic product” $g\phi$ of the mappings


This is done on p. 152. 
g and $\phi$, that is, the mapping carrying out $\phi$ and g successively, in that
order; for any x in [a, b] we form its map u under the mapping $\phi$, and
then apply g to the image $u = \phi(x)$, obtaining $g(\phi(x)) = f(x) = y$ (see
Fig. 1.39). Such a symbolic product $g\phi$ is natural and meaningful
for any type of operation; it signifies that we first perform $\phi$, and
then, on the result, perform g.¹ We must not confuse the symbolic
product $g\phi = g(\phi)$ of two functions with the ordinary algebraic product
$g(x) \cdot \phi(x)$ of the functions, in which both g(x) and $\phi(x)$ are formed
for the same argument x (the mappings applied to the same point)
and the product of the values of the functions is formed.

Figure 1.39 Symbolic product $g\phi = f$ of two mappings.


Naturally, symbolic products cannot be expected to be commutative.
In general, $g(\phi)$ and $\phi(g)$ are not the same, even where both are defined;
the order in which operations are performed matters very much. If,
for example, $\phi$ stands for the operation of “adding 1 to a number”
and g for the operation of “multiplying a number by 2,” then

$$g(\phi(x)) = 2(x + 1) = 2x + 2, \qquad \phi(g(x)) = (2x) + 1 = 2x + 1.$$


(See Fig. 1.40.) 
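The distinction between the symbolic product and the ordinary algebraic product can be checked directly. This is a small sketch; the helper `compose` and the variable names are chosen for this illustration only.

```python
def compose(g, phi):
    """Symbolic product of g and phi: apply phi first, then g."""
    return lambda x: g(phi(x))

phi = lambda x: x + 1      # "adding 1 to a number"
g = lambda x: 2 * x        # "multiplying a number by 2"

g_phi = compose(g, phi)    # x -> 2(x + 1) = 2x + 2
phi_g = compose(phi, g)    # x -> (2x) + 1 = 2x + 1

for x in range(4):
    # columns: x, g(phi(x)), phi(g(x)), and the algebraic product g(x) * phi(x)
    print(x, g_phi(x), phi_g(x), g(x) * phi(x))
```

The three right-hand columns differ for every x shown, so the two orders of composition and the algebraic product are three different functions.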

In order to be able to form the symbolic product $g\phi$ of two mappings,
the “factors” g and $\phi$ must fit together in the sense that the domain of
g must include the range of $\phi$; thus we cannot form $g\phi$ when

$$g(u) = \sqrt{u} \quad\text{and}\quad \phi(x) = -1 - x^2.$$


1 That the product $g\phi$ corresponds to first carrying out $\phi$ and then g (in that order)
seems unnatural at first glance, but actually corresponds to the convention always
adopted in mathematics of writing the argument x of a function f(x) to the right of
the symbol f for the function. Thus, for example, in sin (log x) it is always understood
that we first form the logarithm of x and then take the sine of that, and not the
other way around.
It is useful to consider functions which are compounded more than
once. Such a function is

$$f(x) = \sqrt{1 + \tan(x^2)},$$

which can be built up by successive compositions

$$\phi(x) = x^2, \qquad \psi(\phi) = 1 + \tan\phi, \qquad g(\psi) = \sqrt{\psi} = f(x).$$

We would write symbolically $f = g\psi\phi$.


Figure 1.40 Noncommutativity of mappings: 2x + 1 and 2(x + 1).


Inverse Functions 


The notion of “inverse function” becomes clearer in the context of
products of mappings. Consider the mapping $\phi$ associating with a
point x of the domain of $\phi$ the image $u = \phi(x)$. Assume that our
mapping $\phi$ is such that different x are always mapped into different u.
The mapping is then called “one to one.” Then a value u is the image
of at most one value x. We can associate with every u in the range of
$\phi$ the value x = g(u) of which u is the image under the mapping $\phi$.
In this way we have defined a mapping g whose domain is the range of
$\phi$ and which, when applied to an image $u = \phi(x)$ of the $\phi$-mapping,
reproduces the original value x, that is, $g(\phi(x)) = x$. We call g the
inverse of $\phi$. It is characterized by the symbolic equation $g\phi x = x$.


The Identity Mapping 


We define the identity mapping I as the one that maps every x into
itself; for the inverse g of $\phi$, then, $g\phi = I$.¹ The mapping I plays the same
role for symbolic multiplication as the number 1 in ordinary multiplication;
multiplication by I does not change a mapping. Accordingly,
the equation $g\phi = I$ suggests the notation $g = \phi^{-1}$ for the inverse of $\phi$.
For example, the inverse x = arc sin u of the function u = sin x is
often denoted by $x = \sin^{-1} u$.²


1 More precisely, $g\phi$ agrees with I in the domain of $\phi$.
2 This must not be confused with the algebraic reciprocal 1/(sin u).




From the definition of the inverse g of $\phi$ it follows immediately that
$\phi$ is also the inverse of g, so that not only $g(\phi(x)) = x$ but also $\phi(g(u)) = u$.


* A monotone function $u = \phi(x)$ defined in an interval $a \le x \le b$ clearly
defines a 1-1 mapping of that interval. If, in addition, $\phi$ is continuous, then
as we saw earlier as a consequence of the intermediate value theorem (p. 44),
the range of $\phi$ is the interval with end points $\phi(a)$ and $\phi(b)$. In that case the
inverse g of $\phi$ exists and is again monotone and continuous in that latter
interval. As a matter of fact the monotone continuous functions are the only
continuous functions that have inverses or define one-to-one mappings. Indeed,
let $u = \phi(x)$ be a continuous function in the closed interval [a, b] mapping
different x of the interval into different u. Then in particular the values
$\phi(a) = \alpha$ and $\phi(b) = \beta$ are distinct. We assume, say, that $\alpha < \beta$. Then we
can show that $\phi(x)$ is monotonic increasing throughout the interval. For if
that were not the case we could find two values c and d with $a \le c < d \le b$
for which $\phi(d) < \phi(c)$. If here also $\phi(d) \ge \phi(a)$ it would follow from the
intermediate value theorem that there exists a $\xi$ in the interval [a, c] for which
$\phi(\xi) = \phi(d)$. This $\xi$ would be different from d and our mapping could not
be 1-1. If, on the other hand, $\phi(d) < \phi(a) = \alpha$ it would follow that $\phi(a)$ is
intermediate between $\phi(d)$ and $\phi(b)$; there would then be a $\xi$ intermediate
between d and b for which $\phi(\xi) = \phi(a)$, and this also contradicts the 1-1
nature of $\phi$.
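For a continuous, monotonic increasing mapping on a closed interval, the inverse can be computed by repeatedly halving the interval, which mirrors the role the intermediate value theorem plays in the argument above. The sketch below is an illustration under assumptions: the name `inverse_increasing`, the tolerance, and the sample function $x^3$ on [0, 2] are all chosen here and do not come from the text.

```python
def inverse_increasing(phi, a, b, tol=1e-12):
    """Return g with g(phi(x)) = x, assuming phi is continuous and increasing on [a, b]."""
    alpha, beta = phi(a), phi(b)

    def g(u):
        if not alpha <= u <= beta:
            raise ValueError("u lies outside the range of phi")
        lo, hi = a, b
        # Invariant: phi(lo) <= u <= phi(hi); halve the interval until it is tiny.
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if phi(mid) < u:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    return g

cube = lambda x: x ** 3
cube_root = inverse_increasing(cube, 0.0, 2.0)
print(cube_root(5.0), cube(cube_root(5.0)))
```

Applying `cube` after `cube_root` returns the starting value up to the tolerance, which is the numerical counterpart of $\phi(g(u)) = u$.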


An important, almost obvious property of compound functions is
that $g(\phi(x)) = f(x)$ is continuous (where defined) if g and $\phi$ are. Indeed,
for given positive $\epsilon$ we have

$$|f(x) - f(x_0)| = |g(\phi(x)) - g(\phi(x_0))| < \epsilon \quad\text{for}\quad |\phi(x) - \phi(x_0)| < \delta$$

as a consequence of the continuity of the function g. Since, however,
$\phi$ is also continuous, we certainly have $|\phi(x) - \phi(x_0)| < \delta$ for all x
satisfying $|x - x_0| < \delta'$ with some suitable positive $\delta'$. Hence

$$|f(x) - f(x_0)| < \epsilon \quad\text{for}\quad |x - x_0| < \delta'$$

which shows the continuity of f.

It is much easier to appeal to this general theorem in proving continuity
of compound functions like $\sqrt{1 - x^2}$ than to try to construct
directly a modulus of continuity for the function.


1.4 Sequences 


Hitherto we have considered functions of a continuous variable,
or functions whose domains consist of one or more intervals. However,
numerous cases occur in mathematics in which a quantity a
depends on a positive integer n. Such a function a(n) associates a
value with every natural number n. The function a(n) is called a
sequence, specifically, an infinite sequence, if n ranges over all positive
integers. Usually, we write $a_n$¹ instead of a(n) for the “nth element” of
the sequence, and think of the elements forming a sequence arranged
in order of increasing subscripts n:

$$a_1, a_2, a_3, \ldots.$$

Here the dependence of the numbers $a_n$ on n may be defined by any
law whatsoever, and, in particular, the values $a_n$ need not all be distinct
from each other. The idea of a sequence will most easily be grasped
by examples.


1. The sum of the first n integers

$$S(n) = 1 + 2 + 3 + \cdots + n = \tfrac{1}{2} n(n + 1)$$

is a function of n, giving rise to the sequence

$$1, 3, 6, 10, 15, \ldots.$$


2. Another simple function of n is the expression “n-factorial,”
$n! = 1 \cdot 2 \cdot 3 \cdots n$, the product of the first n integers.


3. Every integer n > 1 which is not a prime number is divisible by 
more than two positive integers, whereas the prime numbers are 
divisible only by themselves and by 1. We can obviously consider the 
number T(n) of divisors of n as a function of n itself. For the first few 
numbers it is given by the table: 


n    =  1  2  3  4  5  6  7  8  9 10 11 12
T(n) =  1  2  2  3  2  4  2  4  3  4  2  6


4. A sequence of great importance in the Theory of Numbers is
$\pi(n)$, the number of primes less than the number n. Its detailed
investigation is one of the most fascinating problems. The principal
result is: The number $\pi(n)$ is given asymptotically,² for large values of n,
by the function n/log n, where by log n we mean the logarithm to the
“natural base” e, to be defined later (p. 77).


1 Pronounced “a-sub-n.”
2 That is, the quotient of the number $\pi(n)$ by the number n/log n differs arbitrarily
little from one, provided only that n is large enough.
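The four example sequences can be tabulated with a few lines of code. This is a sketch using straightforward (not efficient) definitions; the function names are chosen for this illustration, and `primes_below` follows the convention stated above of counting primes less than n.

```python
import math

def triangular(n: int) -> int:
    """S(n) = 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def divisor_count(n: int) -> int:
    """T(n): the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1))

def primes_below(n: int) -> int:
    """pi(n) in the sense used here: the number of primes less than n."""
    return sum(1 for k in range(2, n) if is_prime(k))

print([triangular(n) for n in range(1, 6)])
print([math.factorial(n) for n in range(1, 6)])
print([divisor_count(n) for n in range(1, 13)])
for n in (10**2, 10**3, 10**4):
    count = primes_below(n)
    print(n, count, count / (n / math.log(n)))  # quotient pi(n) / (n / log n)
```

For arguments this small the quotient in the last column is still visibly different from one; the asymptotic statement only concerns what happens as n grows without bound, and the approach to one is slow.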
1.5 Mathematical Induction


We insert here a discussion of a very important type of reasoning 
which permeates much of mathematical thought. 

The fact that the whole sequence of natural numbers is generated by
starting with the number 1 and passing from n to n + 1 leads to the
fundamental “principle of mathematical induction.” In the natural
sciences we derive by “empirical induction” from a large number of 
samples, a law which is expected to hold generally. The degree of 
certainty of the law depends then on the number of times a sample 
or an “event” has been observed and the law confirmed. This type of 
induction can be overwhelmingly convincing, although it does not 
carry with it the logical certainty of a mathematical proof. 

Mathematical induction is used to establish with logical certainty the 
correctness of a theorem for an infinite sequence of cases. Let A 
denote a statement referring to an arbitrary natural number n. For 
example, A might be the statement “The sum of the interior angles in a
simple polygon of n + 2 sides is n times 180°” or $n\pi$. To prove a
statement of this type it is not sufficient to prove it for the first 10 or
the first 100 or even the first 1000 values of n. Instead, we have to
apply a mathematical method which we explain first for this example.
For n = 1 the polygon reduces to a triangle, for which the sum of the
angles is known to be 180°. For a quadrangle corresponding to
n = 2 we draw a diagonal dividing the quadrangle into two triangles.
This shows that the sum of the angles of the quadrangle is equal to the
combined sum of the angles of the two triangles, that is, 180° + 180° =
2 · 180°. Proceeding to the example of a pentagon we can divide this
into a quadrangle and a triangle by drawing a suitable diagonal. This
yields for the sum of the angles of the pentagon the value 2 · 180° +
1 · 180° = 3 · 180°. We can go on in this manner and prove the
general theorem successively for n = 4, 5, etc. The correctness of the
statement A for any n follows from its correctness for the preceding n;
in this way its general validity is established for all n.


General Formulation 


What is essential in the proof of statement A in our example is that A
is proved successively for the special cases $A_1, A_2, \ldots, A_n, \ldots$. The
possibility of doing this depends on two factors: (1) a general proof
has to be given showing that the statement $A_{r+1}$ is correct whenever $A_r$
is correct and (2) the statement $A_1$ must be proved. That these two
conditions are sufficient to prove the correctness of all $A_1, A_2, A_3, \ldots$
constitutes the principle of mathematical induction. In what follows we
accept the validity of this principle as a basic fact of logic.


The principle can be formulated in a more general abstract form. “Let S 
be any set consisting of natural numbers which has the following two 
properties: (1) whenever S contains a number r, then it also contains the 
number r + 1 and (2) S contains the number 1. Then it is true that S is the 
set of all natural numbers.” The previous formulation of the principle of 
mathematical induction follows if we take for S the set of all natural numbers 
for which statement A is correct.

Often the principle is applied without specific mention or its use is indicated 
only by the expression, “etc.” This happens particularly often in elementary 
mathematics. However, in more complicated situations an explicit appeal to 
the principle is preferable. 


Examples. Two applications follow as illustrations. 

First we prove a formula for the sum of the first n squares. By some
trial we find for small n (say n < 5) that the following formula,¹
denoted by $A_n$, holds:

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

We conjecture that this formula is correct for all n. For the proof we
assume that r is any number for which the formula $A_r$ is correct, that
is, that

$$1^2 + 2^2 + 3^2 + \cdots + r^2 = \frac{r(r + 1)(2r + 1)}{6}.$$

Adding $(r + 1)^2$ to both sides, we obtain

$$1^2 + 2^2 + \cdots + r^2 + (r + 1)^2 = \frac{r(r + 1)(2r + 1)}{6} + (r + 1)^2 = \frac{(r + 1)[(r + 1) + 1][2(r + 1) + 1]}{6}.$$

This, however, is just the statement $A_{r+1}$ obtained by substituting
r + 1 for n in $A_n$. Thus the truth of $A_r$ implies that of $A_{r+1}$. To
complete the proof of $A_n$ for general n we need only to verify the
correctness of $A_1$, that is, of

$$1^2 = \frac{1 \cdot (1 + 1)(2 \cdot 1 + 1)}{6}.$$


1 Incidentally, this result was used by the Greek mathematician Archimedes in 
his work on spirals. 
Since this is obviously correct, the formula $A_n$ is established for all
natural n.


The reader should prove by a similar argument that

$$1^3 + 2^3 + \cdots + n^3 = \left[\frac{n(n + 1)}{2}\right]^2.$$
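A numerical check cannot replace the induction proof, but it is a cheap guard against algebra slips. The sketch below tests the closed form for the sum of the first n squares, the algebraic identity used in the inductive step, and the companion identity for cubes, taken here to be $1^3 + 2^3 + \cdots + n^3 = [n(n+1)/2]^2$. All function names are chosen for this illustration.

```python
def squares_closed_form(n: int) -> int:
    """Right-hand side of A_n: n(n + 1)(2n + 1) / 6."""
    return n * (n + 1) * (2 * n + 1) // 6

def cubes_closed_form(n: int) -> int:
    """[n(n + 1) / 2] ** 2, the assumed closed form for the sum of the first n cubes."""
    return (n * (n + 1) // 2) ** 2

def inductive_step_holds(r: int) -> bool:
    """Adding (r + 1)**2 to the right side of A_r must give the right side of A_{r+1}."""
    return squares_closed_form(r) + (r + 1) ** 2 == squares_closed_form(r + 1)

print(squares_closed_form(1) == 1)                            # the base case A_1
print(all(inductive_step_holds(r) for r in range(1, 1000)))   # the step, for many r
print(all(sum(k ** 3 for k in range(1, n + 1)) == cubes_closed_form(n)
          for n in range(1, 200)))
```

Checking the step for a finite range of r is exactly the kind of "empirical induction" the text contrasts with a proof; the algebraic argument is what covers every r at once.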


As a further illustration for the principle of induction we prove 


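THE BINOMIAL THEOREM. The statement $A_n$ of the theorem is represented
by the formula

$$(a + b)^n = a^n + \frac{n}{1} a^{n-1} b + \frac{n(n - 1)}{1 \cdot 2} a^{n-2} b^2 + \frac{n(n - 1)(n - 2)}{1 \cdot 2 \cdot 3} a^{n-3} b^3 + \cdots + \frac{n(n - 1) \cdots 2}{1 \cdot 2 \cdot 3 \cdots (n - 1)} a b^{n-1} + b^n.$$

It is customary to write the formula in the form

$$(a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + \binom{n}{n} b^n,$$

where the binomial coefficient $\binom{n}{k}$ is defined by

$$\binom{n}{k} = \frac{n(n - 1)(n - 2) \cdots (n - k + 1)}{k!} = \frac{n!}{k!\,(n - k)!}$$

for k = 1, 2, ..., n − 1 and

$$\binom{n}{0} = \binom{n}{n} = 1.$$

(If we define 0! = 1, the general formula for $\binom{n}{k}$ applies also to the
cases k = 0 and k = n.)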

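If $A_n$ holds for a certain n, we find by multiplying both sides with
(a + b) that

$$\begin{aligned}
(a + b)^{n+1} &= (a + b)\left[\binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \cdots + \binom{n}{n} b^n\right] \\
&= \binom{n}{0} a^{n+1} + \left[\binom{n}{1} + \binom{n}{0}\right] a^n b + \left[\binom{n}{2} + \binom{n}{1}\right] a^{n-1} b^2 + \cdots \\
&\quad + \left[\binom{n}{n} + \binom{n}{n-1}\right] a b^n + \binom{n}{n} b^{n+1}.
\end{aligned}$$

Now

$$\begin{aligned}
\binom{n}{k} + \binom{n}{k+1} &= \frac{n(n - 1) \cdots (n - k + 1)}{k!} + \frac{n(n - 1) \cdots (n - k + 1)(n - k)}{(k + 1)!} \\
&= \frac{n(n - 1) \cdots (n - k + 1)}{k!}\left(1 + \frac{n - k}{k + 1}\right) \\
&= \frac{(n + 1)\,n(n - 1) \cdots (n - k + 1)}{(k + 1)!} = \binom{n+1}{k+1}.
\end{aligned}$$

Since $\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$, we have

$$(a + b)^{n+1} = \binom{n+1}{0} a^{n+1} + \binom{n+1}{1} a^n b + \binom{n+1}{2} a^{n-1} b^2 + \cdots + \binom{n+1}{n} a b^n + \binom{n+1}{n+1} b^{n+1},$$

which is the formula $A_{n+1}$. Since also for n = 1

$$(a + b)^1 = \binom{1}{0} a^1 + \binom{1}{1} b^1,$$

the binomial theorem holds for all natural numbers n.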
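The two facts that drive this induction, the addition rule for binomial coefficients and the expansion itself, can be spot-checked with exact integer arithmetic. This sketch uses the standard-library function `math.comb`; the helper names `binomial_expansion` and `addition_rule_holds` are chosen for this illustration.

```python
from math import comb

def binomial_expansion(a: int, b: int, n: int) -> int:
    """Sum of C(n, k) * a**(n - k) * b**k for k = 0, ..., n."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

def addition_rule_holds(n: int, k: int) -> bool:
    """C(n, k) + C(n, k + 1) == C(n + 1, k + 1), the identity used in the step."""
    return comb(n, k) + comb(n, k + 1) == comb(n + 1, k + 1)

print(binomial_expansion(3, 5, 7) == (3 + 5) ** 7)
print(all(addition_rule_holds(n, k) for n in range(1, 30) for k in range(n)))
```

Because Python integers are exact, any mismatch would point to a genuine error in a formula rather than to rounding.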


1.6 The Limit of a Sequence 


The fundamental concept on which the whole of mathematical
analysis ultimately rests is that of the limit of an infinite sequence $a_n$.
A number a is often described by an infinite sequence $a_n$ of approximations;
that is, the value a is given by the value $a_n$ with any desired
degree of precision if we choose the index n sufficiently large. We have
already encountered such representations of numbers a as “limits” of
sequences in their representations as infinite decimal fractions; the
real numbers then appeared as limits for increasing n of the sequences
of ordinary decimal fractions with n digits. In Section 1.7 we shall
give a precise general discussion of the limit concept; at this point we
illustrate the idea of limit by some significant examples.

Sequences $a_1, a_2, \ldots$ can be depicted conveniently by a succession of
“blocks,” the element $a_n$ corresponding to the rectangle in the xy-plane
bounded by the lines x = n − 1, x = n, $y = a_n$, y = 0, having $|a_n|$
as area,¹ or equivalently, by the graph of a piecewise constant function
a(x) of a continuous variable x with jump discontinuities at the
points x = n.


a. $a_n = 1/n$

We consider the sequence

$$1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{n}, \ldots.$$

(See Fig. 1.41.) No number of this sequence is zero; but as the number
n grows larger, $a_n$ approaches zero. Furthermore, if we take any
interval centered at the origin, no matter how small, then from a
definite index onward all numbers $a_n$ will be in this interval. This
situation is expressed by saying that as n increases the numbers $a_n$
tend to zero or that they possess the limit zero or that the sequence
$a_1, a_2, a_3, \ldots$ converges to zero.

Figure 1.41 The sequence $a_n = 1/n$.
If the numbers are represented as points on a line, this means that
the points 1/n crowd closer and closer to the point zero as n increases.
The situation is similar for the sequence

$$-1, \frac{1}{2}, -\frac{1}{3}, \frac{1}{4}, \ldots, \frac{(-1)^n}{n}, \ldots.$$

(See Fig. 1.42.) Here too, the numbers $a_n$ tend to zero as n increases;
the only difference is that the numbers $a_n$ are sometimes greater and
sometimes less than the limit zero; as we say, the sequence oscillates
about the limit.


1 We might just as well have chosen the rectangle bounded by the lines x = n,
x = n + 1, $y = a_n$, y = 0 to represent $a_n$.


Figure 1.42 The sequence $a_n = (-1)^n/n$.


The convergence of the sequence to zero is usually expressed symbolically
by the equation

$$\lim_{n \to \infty} a_n = 0,$$

or occasionally by the abbreviation $a_n \to 0$.


b. $a_{2m} = 1/m$; $a_{2m-1} = 1/(2m)$

In the preceding examples, the absolute value of the difference
between $a_n$ and the limit steadily becomes smaller as n increases. This
is not necessarily the case, as is shown by the sequence

$$\frac{1}{2}, 1, \frac{1}{4}, \frac{1}{2}, \frac{1}{6}, \frac{1}{3}, \ldots$$

(see Fig. 1.43) given for even values n = 2m by $a_n = a_{2m} = 1/m$;
for odd values n = 2m − 1 by $a_n = a_{2m-1} = 1/(2m)$. This sequence
also has the limit zero; for every interval about the origin, no matter
how small, contains all the numbers $a_n$ from a certain value of n
onward; but it is not true that every number lies nearer to the limit
zero than the preceding one.

Figure 1.43 The sequence $a_{2n} = 1/n$, $a_{2n-1} = 1/(2n)$.


c. $a_n = \dfrac{n}{n + 1}$

We consider the sequence

$$a_1 = \frac{1}{2}, \; a_2 = \frac{2}{3}, \; \ldots, \; a_n = \frac{n}{n + 1}, \; \ldots.$$

Writing $a_n = 1 - 1/(n + 1)$, we see that as n increases the number
$a_n$ will approach the number 1, in the sense that if we mark off any
interval about the point 1 all the numbers $a_n$ following a certain $a_N$
must fall in that interval. We write

$$\lim_{n \to \infty} a_n = 1.$$
The sequence

$$a_n = \frac{n^2 - 1}{n^2 + n + 1}$$

behaves in a similar way. This sequence also tends to a limit as n
increases, in fact to the limit one; $\lim_{n \to \infty} a_n = 1$. We see this most
readily if we write

$$a_n = 1 - \frac{n + 2}{n^2 + n + 1} = 1 - r_n;$$

we need only show that the numbers $r_n$ tend to zero as n increases. For
all values of n greater than 2 we have n + 2 < 2n and $n^2 + n + 1 > n^2$.
Hence for the remainder $r_n$ we have

$$0 < r_n < \frac{2n}{n^2} = \frac{2}{n} \qquad (n > 2),$$

from which we see that $r_n$ tends to zero as n increases. Our discussion
at the same time gives an estimate of the largest amount by which the
number $a_n$ (for n > 2) can differ from the limit one; this difference
cannot exceed 2/n.

This example illustrates the fact that for large values of n the terms
with the highest exponents in the numerator and denominator of the
fraction for $a_n$ predominate and determine the limit.
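The estimate $0 < r_n < 2/n$ can be turned into an explicit recipe for how far along the sequence one must go to be within a given tolerance of the limit. The sketch below works with exact fractions; the names `a`, `r`, and `first_index_within` are chosen for this illustration and take $a_n = (n^2 - 1)/(n^2 + n + 1)$ as in the example.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    """a_n = (n**2 - 1) / (n**2 + n + 1)."""
    return Fraction(n * n - 1, n * n + n + 1)

def r(n: int) -> Fraction:
    """Remainder r_n = (n + 2) / (n**2 + n + 1), so that a_n = 1 - r_n."""
    return Fraction(n + 2, n * n + n + 1)

def first_index_within(eps: Fraction) -> int:
    """Smallest n > 2 with 2/n <= eps; from this index on, |a_n - 1| < eps."""
    n = 3
    while Fraction(2, n) > eps:
        n += 1
    return n

N = first_index_within(Fraction(1, 100))
print(N, float(1 - a(N)))
```

The index found this way is sufficient, not necessarily the smallest possible one, because it is derived from the bound 2/n rather than from $r_n$ itself.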
d. $a_n = \sqrt[n]{p}$

Let p be any fixed positive number. We consider the sequence
$a_1, a_2, a_3, \ldots, a_n, \ldots$, where

$$a_n = \sqrt[n]{p}.$$

We assert that

$$\lim_{n \to \infty} a_n = \lim_{n \to \infty} \sqrt[n]{p} = 1.$$

We shall prove this by using a lemma that we shall also find useful
for other purposes.


Lemma. If h is a positive number and n a positive integer, then

$$(1 + h)^n \ge 1 + nh. \tag{1}$$

This inequality is a trivial consequence of the binomial theorem
(see p. 59) according to which

$$(1 + h)^n = 1 + nh + \frac{n(n - 1)}{2} h^2 + \cdots + h^n,$$

if we observe that all terms in the expansion of $(1 + h)^n$ are nonnegative.
The same argument yields the stronger inequality

$$(1 + h)^n \ge 1 + nh + \frac{n(n - 1)}{2} h^2.$$


Returning to our sequence, we distinguish between the cases p > 1
and p < 1 (if p = 1, then $\sqrt[n]{p}$ is equal to 1 for every n, and our statement
is certainly true).

If p > 1, then $\sqrt[n]{p}$ also is greater than 1; we set $\sqrt[n]{p} = 1 + h_n$,
where $h_n$ is a positive quantity depending on n; by the inequality (1)
we have

$$p = (1 + h_n)^n \ge 1 + n h_n,$$

implying

$$0 < h_n \le \frac{p - 1}{n}.$$

As n increases the number $h_n$ must tend to 0, which proves that $a_n$
converges to the limit one, as stated. At the same time we have a
means for estimating how close any $a_n$ is to the limit one, since the
difference $h_n$ between $a_n$ and one is not greater than (p − 1)/n.


If p < 1, then 1/p > 1 and $\sqrt[n]{1/p}$ converges to the limit one. However,

$$\sqrt[n]{p} = \frac{1}{\sqrt[n]{1/p}}.$$

As the reciprocal of a quantity tending to one, $\sqrt[n]{p}$ itself tends to one.
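Both the lemma and the resulting estimate for the nth root of p lend themselves to a quick check. In this sketch the lemma is tested in exact rational arithmetic, while the root is computed in floating point, so a small tolerance is allowed in comparisons; the names `bernoulli_holds` and `root_gap` and the sample value p = 10 are assumptions of this illustration.

```python
from fractions import Fraction

def bernoulli_holds(h: Fraction, n: int) -> bool:
    """Inequality (1): (1 + h)**n >= 1 + n*h, checked in exact arithmetic."""
    return (1 + h) ** n >= 1 + n * h

def root_gap(p: float, n: int) -> float:
    """h_n = p**(1/n) - 1, the signed distance of the n-th root of p from 1."""
    return p ** (1.0 / n) - 1.0

p = 10.0
for n in (1, 10, 100, 1000):
    # columns: n, the gap h_n, and the bound (p - 1)/n from the text
    print(n, root_gap(p, n), (p - 1) / n)
```

The bound (p − 1)/n is generous: it shows that the gap tends to zero but typically overestimates it for large n.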


e. $a_n = \alpha^n$

We consider the sequence $a_n = \alpha^n$, where $\alpha$ is fixed and n runs through
the sequence of positive integers.

First, let $\alpha$ be a positive number less than one. We then put
$\alpha = 1/(1 + h)$, where h is positive, and the inequality (1) gives

$$a_n = \alpha^n = \frac{1}{(1 + h)^n} \le \frac{1}{1 + nh} < \frac{1}{nh}.$$

Since h, and consequently 1/h, depends only on $\alpha$ and does not change
as n increases, we see that $\alpha^n$ tends to zero as n increases:

$$\lim_{n \to \infty} \alpha^n = 0 \qquad (0 < \alpha < 1).$$

The same relationship holds when $\alpha$ is zero, or negative but greater
than −1. This is immediately obvious, since in any case $\lim_{n \to \infty} |\alpha|^n = 0$.

If $\alpha = 1$, then $\alpha^n$ always is equal to one and we shall have to regard
the number one as the limit of $\alpha^n$.

If $\alpha > 1$, we put $\alpha = 1 + h$, where h is positive, and at once see
from our inequality that as n increases $\alpha^n$ does not tend to any definite
limit, but increases beyond all bounds. We say that $\alpha^n$ tends to infinity
as n increases or that $\alpha^n$ becomes infinite; in symbols,

$$\lim_{n \to \infty} \alpha^n = \infty \qquad (\alpha > 1).$$

We explicitly emphasize that the symbol ∞ does not denote a number
and that we cannot calculate with it according to the usual rules; statements
asserting that a quantity is or becomes infinite never have the
same sense as an assertion involving definite quantities. In spite of
this, such modes of expression and the use of the symbol ∞ are
extremely convenient, as we shall often see in the following pages.

If $\alpha = -1$, the value of $\alpha^n$ does not tend to any limit, but as n runs
through the sequence of positive integers $\alpha^n$ takes the values +1 and
−1 alternately. Similarly, if $\alpha < -1$ the value of $\alpha^n$ increases
numerically beyond all bounds, but its sign is alternately positive and
negative.


f. Geometrical Illustration of the Limits of $x^n$ and $\sqrt[n]{p}$


If we consider the graphs of the functions $y = x^n$ and $y = x^{1/n} = \sqrt[n]{x}$
and restrict ourselves for the sake of convenience to nonnegative values
of x, the preceding limits are illustrated by Figs. 1.44 and 1.45 respectively.
We see that in the interval from 0 to 1 the curves $y = x^n$ come
closer and closer to the x-axis as n increases, whereas outside that
interval they climb more and more steeply and approach a line parallel
to the y-axis. All the curves pass through the point with coordinates
x = 1, y = 1 and the origin.

Figure 1.44 $x^n$ as n increases.

The graphs of the functions $y = x^{1/n} = \sqrt[n]{x}$ come closer and
closer to the line parallel to the x-axis and at a distance 1 above it;
again all the curves must pass through the origin and the point (1, 1).
Hence in the limit the curves approach the broken line consisting of
the part of the y-axis between the points y = 0 and y = 1 and of the
parallel to the x-axis y = 1. Moreover, it is clear that the two figures
are closely related, as one would expect from the fact that the functions
$y = \sqrt[n]{x}$ are the inverse functions of the nth powers, from which we
infer that for each n the graph of $y = x^n$ is transformed into that of
$y = \sqrt[n]{x}$ by reflection in the line y = x.

Figure 1.45 $x^{1/n}$ as n increases.


g. The Geometric Series 


An example of a limit familiar from elementary mathematics is
furnished by the geometric series

$$1 + q + q^2 + \cdots + q^{n-1} = S_n;$$

the number q is called the common ratio or quotient of the series. The
value of this sum may, as is well known, be expressed in the form

$$S_n = \frac{1 - q^n}{1 - q},$$

provided that q ≠ 1; we can derive this expression by multiplying the
sum $S_n$ by q and subtracting the equation thus obtained from the
original equation or we may verify the formula by division.

What becomes of the sum $S_n$ when n increases indefinitely? The
answer is: The sequence of sums $S_n$ has a definite limit S if q lies
between −1 and +1, these end values being excluded, and

$$S = \lim_{n \to \infty} S_n = \frac{1}{1 - q}.$$

In order to verify this statement we write $S_n$ as $(1 - q^n)/(1 - q) = 1/(1 - q) - q^n/(1 - q)$.
We have already shown that provided
|q| < 1 the quantity $q^n$ tends to zero as n increases; hence under this
assumption $q^n/(1 - q)$ also tends to zero and $S_n$ tends to the limit
1/(1 − q) as n increases.

The passage to the limit $\lim_{n \to \infty} (1 + q + q^2 + \cdots + q^{n-1}) = 1/(1 - q)$
is usually expressed by saying that when |q| < 1 the sum of the infinite
geometric series is the expression 1/(1 − q).

The sums $S_n$ of the finite geometric series are also called the partial
sums of the infinite geometric series $1 + q + q^2 + \cdots$. (We must
draw a distinction between the sequence of numbers $q^n$ and the partial
sums of the geometric series.)

The fact that the partial sums $S_n$ of the geometric series tend to the
limit S = 1/(1 − q) as n increases is also expressed by saying that
the infinite geometric series $1 + q + q^2 + \cdots$ converges to the sum
S = 1/(1 − q) when |q| < 1.

In passing it should be noted that if q is rational, for example, q = 1/2 or
q = 1/3, then the sum of the infinite geometric series has a rational value
(in the cases mentioned the values are 2 and 3/2, respectively). This
observation is behind the well-known fact that periodic decimal fractions
always represent rational numbers.¹ The general proof of this fact will
be clear from the example of the number

$$x = 0.343434\cdots,$$

which can be evaluated by writing

$$x = \frac{34}{100} + \frac{34}{100^2} + \frac{34}{100^3} + \cdots = \frac{34}{100}\left(1 + \frac{1}{100} + \frac{1}{100^2} + \cdots\right) = \frac{34}{100} \cdot \frac{1}{1 - 1/100} = \frac{34}{99}.$$


1 See Courant and Robbins, What Is Mathematics ?, p. 66. 
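The closed form for the partial sums, and the evaluation of a periodic decimal as a geometric series, can be reproduced exactly with rational arithmetic. This is a sketch; the function names are chosen here, and `partial_sum` follows the convention $S_n = 1 + q + \cdots + q^{n-1}$, which matches the formula $S_n = (1 - q^n)/(1 - q)$ used in the text.

```python
from fractions import Fraction

def partial_sum(q: Fraction, n: int) -> Fraction:
    """S_n = 1 + q + ... + q**(n - 1), added up term by term."""
    return sum((q ** k for k in range(n)), Fraction(0))

def closed_form(q: Fraction, n: int) -> Fraction:
    """(1 - q**n) / (1 - q), valid for q != 1."""
    return (1 - q ** n) / (1 - q)

def repeating_decimal(block: int, length: int) -> Fraction:
    """Value of 0.(block)(block)... where the repeating block has `length` digits."""
    ratio = Fraction(1, 10 ** length)
    return Fraction(block, 10 ** length) / (1 - ratio)

q = Fraction(1, 3)
for n in (1, 5, 10, 20):
    # the gap between the limit 1/(1 - q) and S_n is exactly q**n / (1 - q)
    print(n, partial_sum(q, n), float(1 / (1 - q) - partial_sum(q, n)))
print(repeating_decimal(34, 2))
```

Using `Fraction` keeps every quantity rational, which is the point of the remark about periodic decimals: each step of the evaluation stays inside the rational numbers.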
h. $a_n = \sqrt[n]{n}$

We show that the sequence of numbers

$$a_1 = 1, \; a_2 = \sqrt{2}, \; a_3 = \sqrt[3]{3}, \; \ldots, \; a_n = \sqrt[n]{n}, \; \ldots$$

tends to 1 as n increases:

$$\lim_{n \to \infty} \sqrt[n]{n} = 1.$$

Since $a_n$ exceeds the value 1 for n > 1, we set $a_n = 1 + h_n$, with $h_n$ positive.
Then (see p. 64)

$$n = (a_n)^n = (1 + h_n)^n \ge 1 + n h_n + \frac{n(n - 1)}{2} h_n^2.$$

It follows for n > 1 that

$$h_n^2 \le \frac{2}{n - 1};$$

hence

$$h_n \le \frac{\sqrt{2}}{\sqrt{n - 1}}.$$

We now have

$$1 \le a_n = 1 + h_n \le 1 + \frac{\sqrt{2}}{\sqrt{n - 1}}.$$

The right-hand side of this inequality obviously tends to one, and
therefore so does $a_n$.
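The two-sided estimate for the nth root of n, read here as $1 \le \sqrt[n]{n} \le 1 + \sqrt{2}/\sqrt{n-1}$ for n > 1, can be inspected numerically. This is a floating-point sketch; the names `nth_root_of_n` and `upper_bound` are chosen for this illustration.

```python
import math

def nth_root_of_n(n: int) -> float:
    """a_n = n**(1/n)."""
    return n ** (1.0 / n)

def upper_bound(n: int) -> float:
    """1 + sqrt(2) / sqrt(n - 1), the upper estimate for n > 1."""
    return 1.0 + math.sqrt(2.0 / (n - 1))

for n in (2, 10, 100, 1000, 10000):
    print(n, nth_root_of_n(n), upper_bound(n))
```

The sequence is squeezed between 1 and a bound that itself tends to 1, although the bound approaches 1 much more slowly than the sequence does.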


i. $a_n = \sqrt{n + 1} - \sqrt{n}$


In this example the $a_n$ are differences of two terms, each of which
increases beyond all bounds. Attempting to pass to the limit separately
with each of the two terms, we obtain the meaningless symbolic
expression ∞ − ∞. In such a case the existence of a limit and what
its value may be depends completely on the special case. We assert
that in our example

$$\lim_{n \to \infty} \left(\sqrt{n + 1} - \sqrt{n}\right) = 0.$$

For the proof we need only write the expression in the form

$$\sqrt{n + 1} - \sqrt{n} = \frac{(\sqrt{n + 1} - \sqrt{n})(\sqrt{n + 1} + \sqrt{n})}{\sqrt{n + 1} + \sqrt{n}} = \frac{1}{\sqrt{n + 1} + \sqrt{n}}$$

and see at once that it tends to zero as n increases.




j. $a_n = \dfrac{n}{\alpha^n}$, for $\alpha > 1$


Formally, the limit of the $a_n$ is of the indeterminate type ∞/∞ already
encountered in Example c. We assert that in this example the sequence
of numbers $a_n = n/\alpha^n$ tends to the limit zero.

For the proof we put $\alpha = 1 + h$, where h > 0, and again make use
of the inequality

$$(1 + h)^n \ge 1 + nh + \frac{n(n - 1)}{2} h^2 > \frac{n(n - 1)}{2} h^2.$$

Hence for n > 1

$$a_n = \frac{n}{(1 + h)^n} < \frac{2}{(n - 1) h^2}.$$

Since $a_n$ is positive and the right-hand side of this inequality tends to
zero, $a_n$ must also tend to zero.
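The estimate $0 < n/\alpha^n < 2/((n-1)h^2)$ with $\alpha = 1 + h$ can be watched at work for a base only slightly above 1. The sketch below uses the sample value $\alpha = 1.1$, an assumption of this illustration, as are the names `term` and `bound`.

```python
def term(n: int, alpha: float) -> float:
    """a_n = n / alpha**n."""
    return n / alpha ** n

def bound(n: int, alpha: float) -> float:
    """2 / ((n - 1) * h**2) with h = alpha - 1, valid for n > 1."""
    h = alpha - 1.0
    return 2.0 / ((n - 1) * h * h)

alpha = 1.1
for n in (2, 10, 50, 100, 500):
    print(n, term(n, alpha), bound(n, alpha))
```

With a base this close to 1 the terms grow at first and only later decay, so the limit statement says nothing about the early behavior; the bound nevertheless holds for every n > 1.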


1.7 Discussion of the Concept of Limit 


a. Definition of Convergence and Divergence 


From the examples discussed in Section 1.6 we abstract the following 
general concept of limit: 


Suppose that for a given infinite sequence of points $a_1, a_2, a_3, \ldots$ there
is a number l such that every open interval, no matter how small, marked
off about the point l, contains all the points $a_n$ except for at most a
finite number. The number l is then called the limit of the sequence
$a_1, a_2, \ldots$, or we say that the sequence $a_1, a_2, \ldots$ is convergent and
converges to l; in symbols, $\lim_{n \to \infty} a_n = l$.


The following definition of limit is equivalent:


To any positive number $\epsilon$, no matter how small, we can assign a
sufficiently large integer $N = N(\epsilon)$ such that from the index N onward
[that is, for $n > N(\epsilon)$] we always have $|a_n - l| < \epsilon$.


Of course, it is true as a rule that $N(\epsilon)$ will have to be chosen larger
and larger for smaller and smaller values of the tolerance $\epsilon$; in other
words, $N(\epsilon)$ will usually increase beyond all bounds as $\epsilon$ tends to zero.
The vague intuitive notion of limit suggests a picture of the $a_n$ moving
closer and closer to l. This picture is replaced here by the precise “static”
definition: Any neighborhood of l contains all $a_n$ with at most a finite
number of exceptions.¹
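The "static" definition can be probed experimentally by searching for the last exception within a finite stretch of the sequence. The sketch below is empirical only: it inspects indices up to an arbitrary `horizon` and therefore cannot establish convergence. The function name and the three sample sequences (1/n, the uneven sequence with $a_{2m} = 1/m$ and $a_{2m-1} = 1/(2m)$, and the alternating sequence of +1 and −1) are choices made for this illustration.

```python
from fractions import Fraction

def first_good_index(a, limit, eps, horizon=10_000):
    """Smallest N with |a(n) - limit| < eps for every n in N..horizon (finite check only)."""
    N = 1
    for n in range(1, horizon + 1):
        if abs(a(n) - limit) >= eps:
            N = n + 1   # n is an exception, so N must lie beyond it
    return N

def reciprocal(n):
    return Fraction(1, n)

def uneven(n):
    """a_{2m} = 1/m and a_{2m-1} = 1/(2m)."""
    return Fraction(2, n) if n % 2 == 0 else Fraction(1, n + 1)

def alternating(n):
    return (-1) ** n

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(eps, first_good_index(reciprocal, 0, eps), first_good_index(uneven, 0, eps))
print(first_good_index(alternating, 0, Fraction(1, 2), horizon=1000))
```

For the two convergent sequences the index grows as the tolerance shrinks, in line with the remark about $N(\epsilon)$; for the alternating sequence the search runs off the end of the horizon, which is how the absence of a suitable N shows up in a finite experiment.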

Obviously, a sequence $a_1, a_2, \ldots$ cannot have more than one limit l. If on
the contrary two distinct numbers l and l′ were limits of the same sequence
$a_1, a_2, \ldots$, we could mark off open intervals about each of the points l and l′
which do not overlap. Since each interval contains all but a finite number of
the $a_n$, the sequence could not be infinite. The limit of a convergent sequence
is therefore uniquely determined.

Another obvious but useful remark is: If from a convergent sequence we
omit any number of terms the resulting sequence converges to the same limit
as the original sequence.


A sequence which does not converge is said to be divergent. If as n
increases the numbers $a_n$ increase beyond all positive bounds, we say
that the sequence diverges to +∞; as we have already done occasionally,
we write then $\lim_{n \to \infty} a_n = \infty$. Similarly, we write $\lim_{n \to \infty} a_n = -\infty$ if, as n
increases, the numbers $-a_n$ increase beyond all bounds in the positive
direction. But divergence may manifest itself in other ways, as for the
sequence $a_1 = -1, a_2 = +1, a_3 = -1, a_4 = +1, \ldots$, whose terms
swing back and forth between two different values.

Clearly, neither divergence nor convergence of a sequence is affected 
by removing finitely many terms. 

A sequence $a_1, a_2, \ldots$ is bounded if there is a finite interval containing
all points of the sequence. Any finite interval is contained in some
finite interval that has the origin as center. Hence the requirement
that the sequence is bounded means that there exists a number M such
that $|a_n| \le M$ for all n.

A convergent sequence $a_1, a_2, \ldots$ necessarily is also bounded. For
let l be the limit of the sequence. Taking $\epsilon = 1$ we find from the
definition of convergence that all $a_n$ from a certain N onward lie in the
interval of length 2 centered at l. The only terms $a_n$ of the sequence that
may lie outside that interval are $a_1, \ldots, a_{N-1}$. We can then, however,
find a larger finite interval that also includes $a_1, \ldots, a_{N-1}$.


b. Rational Operations with Limits 


From the definition of limit it follows at once that we can perform 
the elementary operations of addition, multiplication, subtraction, and 
division of limits according to the following rules. 


¹ The reader will notice the analogy with the definition of continuity of a function $f(x)$
at a point $x_0$. The role played there by the sufficiently small quantity $\delta(\epsilon)$ is played
here by the sufficiently large integer $N(\epsilon)$. We shall see indeed on p. 82 that
continuity of a function at a point can be formulated in terms of limits of sequences.

If $a_1, a_2, \ldots$ is a sequence with the limit $a$ and $b_1, b_2, \ldots$ is a sequence
with the limit $b$, then the sequence of numbers $c_n = a_n + b_n$ also has a
limit $c$, and

$$c = \lim_{n \to \infty} c_n = a + b.$$

The sequence of numbers $c_n = a_n b_n$ likewise converges and

$$\lim_{n \to \infty} c_n = ab.$$

Similarly, the sequence $c_n = a_n - b_n$ converges and

$$\lim_{n \to \infty} c_n = a - b.$$

Provided the limit $b$ differs from zero, the numbers $c_n = a_n/b_n$ likewise
converge and have the limit

$$\lim_{n \to \infty} c_n = \frac{a}{b}.$$

In words: We can interchange the rational operations of calculation 
with the process of forming the limit; we obtain the same result 
whether we first perform a passage to the limit and then a rational 
operation or vice versa. 

The proofs of all these rules become clear if one of them is carried out.
We consider the multiplication of limits. If the relations $a_n \to a$ and
$b_n \to b$ hold, then for any positive number $\epsilon$, we can insure both

$$|a - a_n| < \epsilon \quad \text{and} \quad |b - b_n| < \epsilon$$

by choosing $n$ sufficiently large, say $n > N(\epsilon)$. If we write

$$ab - a_n b_n = b(a - a_n) + a_n(b - b_n)$$

and recall that there is a positive bound $M$, independent of $n$, such that
$|a_n| < M$, we obtain

$$|ab - a_n b_n| \le |b|\,|a - a_n| + |a_n|\,|b - b_n| < (|b| + M)\epsilon.$$


Since the quantity $(|b| + M)\epsilon$ can be made arbitrarily small by choosing
$\epsilon$ small enough, the difference between $ab$ and $a_n b_n$ actually becomes
as small as we please for all sufficiently large values of $n$; this is
precisely the statement made in the equation

$$ab = \lim_{n \to \infty} a_n b_n.$$

Using this example as a model, the reader can prove the rules for
the remaining rational operations.
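
As an informal numerical illustration (not a substitute for the proof), the following sketch checks the estimate $|ab - a_n b_n| \le |b|\,|a - a_n| + |a_n|\,|b - b_n|$ for two sample sequences. The sequences $a_n = 2 + 1/n$ and $b_n = 3 - 1/n^2$, as well as the function names, are assumptions chosen here only for the example.

```python
# Illustrative check of the estimate used in the product rule for limits.
# The sequences a_n and b_n below are assumptions chosen for this example.
from fractions import Fraction


def a_seq(n):
    return 2 + Fraction(1, n)        # converges to a = 2


def b_seq(n):
    return 3 - Fraction(1, n * n)    # converges to b = 3


def product_gap(n, a=2, b=3):
    """Return (|ab - a_n b_n|, |b||a - a_n| + |a_n||b - b_n|) exactly."""
    an, bn = a_seq(n), b_seq(n)
    lhs = abs(a * b - an * bn)
    rhs = abs(b) * abs(a - an) + abs(an) * abs(b - bn)
    return lhs, rhs


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        lhs, rhs = product_gap(n)
        print(n, float(lhs), float(rhs))
```

Exact rational arithmetic is used so that the inequality is tested without rounding effects; both columns shrink toward zero as $n$ grows.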

By means of these rules many limits can be evaluated easily; thus,
we have

$$\lim_{n \to \infty} \frac{n^2 - 1}{n^2 + n + 1} = \lim_{n \to \infty} \frac{1 - \dfrac{1}{n^2}}{1 + \dfrac{1}{n} + \dfrac{1}{n^2}} = 1,$$

since in the second expression we can pass directly to the limit in the
numerator and denominator.

The following simple rule is frequently useful: If $\lim a_n = a$ and
$\lim b_n = b$, and if in addition $a_n > b_n$ for every $n$, then $a \ge b$. We are,
however, by no means entitled to expect that $a$ will always be greater
than $b$, as is shown by the sequences $a_n = 1/n$, $b_n = 1/(2n)$, for which
$a = b = 0$.


c. Intrinsic Convergence Tests. Monotone Sequences 


In all the examples given the limit of the sequence considered was a 
known number. In fact, to apply the above definition of limit of a se- 
quence we must know the limit before we can verify convergence. If the 
concept of limit of a sequence yielded nothing more than the recognition 
that some known numbers can be approximated by certain sequences 
of other known numbers, we should have gained very little from it. 
The advantage of the concept of limit in analysis lies essentially in the
fact that important problems often have numerical solutions which may 
not otherwise be directly known or expressible, but can be described 
as limits. The whole of higher analysis consists of a succession of 
examples of this fact which will become steadily clearer in the following 
chapters. The representation of the irrational numbers as limits of 
rational numbers may be regarded as the first and typical example. 

Any convergent sequence of known numbers $a_1, a_2, \ldots$ defines a
number $l$, its limit. However, the only test for convergence that arises
from the definition of convergence consists in estimating the differences
$|a_n - l|$, and this is applicable only if the number $l$ is known already.
It is essential to have “intrinsic” tests for convergence that do not
require an a priori knowledge of the value of the limit but only involve
the terms of the sequence themselves. The simplest such test applies
to a special class of sequences, the monotone sequences, and includes
most of the important examples.


Limits of Monotone Sequences


A sequence $a_1, a_2, \ldots$ is called monotonically increasing if each term
$a_n$ is larger, or at least not smaller than the preceding one; that is,

$$a_n \ge a_{n-1}.$$

Similarly, the sequence is monotonically decreasing if $a_n \le a_{n-1}$ for
all $n$. A monotone sequence is one that is either monotonically increasing
or decreasing. With this definition we have the basic principle:


A sequence that is both monotone and bounded converges.¹


This principle is convincingly suggested, but not proved, by intuition; 
it is intimately related to the properties of real numbers and in fact is 
equivalent to the continuity axiom for real numbers. 

The axiom (see Section 1b) that every nested sequence of intervals
contains a point is easily seen to be a consequence of the convergence
of bounded monotone sequences. For let $[a_1, b_1], [a_2, b_2], \ldots$ be a
sequence of nested intervals. By the definition of nested sequences we
have

$$a_1 \le a_2 \le \cdots \le a_n < b_n \le b_{n-1} \le \cdots \le b_1.$$

Obviously, the infinite sequence $a_1, a_2, \ldots$ is monotonically increasing.
It is also bounded since $a_1 \le a_n < b_1$ for all $n$. Hence
$l = \lim_{n \to \infty} a_n$ exists. Moreover, for any $m$ and for any number $n > m$ we have

$$a_m \le a_n \le b_m.$$

Hence also

$$a_m \le \lim_{n \to \infty} a_n = l \le b_m.$$

Thus all intervals of the nested sequence contain one and the same
point $l$. (That they have no other point in common follows from the
further property $\lim_{n \to \infty} (b_n - a_n) = 0$ of nested sequences of intervals.)
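
The argument can be made concrete with a small sketch. Here we build nested intervals for $\sqrt{2}$ by repeated bisection; the choice of $\sqrt{2}$, the starting interval $[1, 2]$, and the function name are assumptions made for illustration only. The left end points form a bounded, monotonically increasing sequence of rational numbers, exactly as in the proof above.

```python
# Nested intervals [a_n, b_n] for sqrt(2), built by repeated bisection.
# Exact rational arithmetic keeps every end point a known rational number.
from fractions import Fraction


def nested_intervals(steps):
    """Return the intervals (a_n, b_n) with a_n**2 <= 2 < b_n**2."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid          # left end points never decrease
        else:
            b = mid          # right end points never increase
        intervals.append((a, b))
    return intervals


if __name__ == "__main__":
    for a, b in nested_intervals(20)[::5]:
        print(float(a), float(b), float(b - a))
```

Each step halves the length $b_n - a_n$, so the lengths tend to zero and the intervals can have only one point in common.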


Cauchy’s Criteria for Convergence 


A convergent sequence is automatically bounded but need not be
monotone (see Example b, p. 62). Hence, in dealing with general
sequences, it is desirable to have a test for convergence that is also
applicable to nonmonotone sequences. This need is satisfied by a
simple condition, the Cauchy test for convergence; this criterion
characterizes sequences of real numbers which have a limit; most
importantly it does not require a priori knowledge of the value of the
limit: Necessary and sufficient for convergence of a sequence $a_1, a_2, \ldots$
is that the elements $a_n$ of the sequence with sufficiently large index $n$
differ arbitrarily little from each other. Formulated precisely: a
sequence $a_n$ is convergent if and only if for every $\epsilon > 0$ there exists a natural
number $N = N(\epsilon)$ such that $|a_n - a_m| < \epsilon$ whenever $n > N$ and $m > N$.
Geometrically, the Cauchy condition states that a sequence converges
if and only if there exist arbitrarily small intervals outside of which there lie only
a finite number of points of the sequence. The correctness of Cauchy’s
test for convergence will be proved and its significance discussed in the
Supplement.

¹ The assumption of boundedness is essential since no unbounded sequence can
converge. Observe that a monotonically increasing sequence $a_1, a_2, \ldots$ is always
“bounded from below”: $a_n \ge a_1$ for all $n$. In order to prove that a monotonically
increasing sequence converges it is sufficient then to find a number $M$ such that
$a_n < M$ for all $n$.
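
The Cauchy condition can be illustrated numerically for a sequence that is not monotone. The sketch below uses the partial sums $a_n = 1 - \tfrac12 + \tfrac13 - \cdots \pm \tfrac1n$, an example chosen here and not taken from the text, and measures the largest difference $|a_n - a_m|$ among the computed terms with index beyond $N$. Only finitely many terms are inspected, so this is an illustration and not a proof.

```python
# Numerical illustration of the Cauchy condition for a nonmonotone sequence.
# The sequence a_n = 1 - 1/2 + 1/3 - ... +- 1/n is an example chosen here.
from fractions import Fraction


def alternating_sums(count):
    """Return [a_1, ..., a_count] as exact fractions."""
    sums, total = [], Fraction(0)
    for k in range(1, count + 1):
        total += Fraction((-1) ** (k + 1), k)
        sums.append(total)
    return sums


def max_spread(sums, N):
    """Largest |a_n - a_m| over the computed terms with n > N and m > N."""
    tail = sums[N:]                  # sums[N] is a_{N+1}
    return max(tail) - min(tail)


if __name__ == "__main__":
    s = alternating_sums(400)
    for N in (10, 50, 200):
        print(N, float(max_spread(s, N)))
```

The spread shrinks as $N$ grows, without any reference to the value of the limit.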


d. Infinite Series and the Summation Symbol 


A sequence is just an ordered infinite array of numbers $a_1, a_2, \ldots$.
An infinite series

$$a_1 + a_2 + a_3 + \cdots$$

requires the terms to be added in the order in which they appear. To
arrive at a precise meaning of the sum of an infinite series we consider
the nth partial sum, that is, the sum of the first $n$ terms of the series

$$s_n = a_1 + a_2 + \cdots + a_n.$$

The partial sums $s_n$ for different $n$ form a sequence

$$s_1 = a_1, \quad s_2 = a_1 + a_2, \quad s_3 = a_1 + a_2 + a_3,$$

and so on. The sum $s$ of the infinite series is then defined as

$$s = \lim_{n \to \infty} s_n,$$

provided this limit exists. In that case we call the infinite series
convergent. If the sequence $s_n$ diverges, the infinite series is called divergent.
For example, the sequence $1, q, q^2, q^3, \ldots$ gives rise to the infinite
geometric series

$$1 + q + q^2 + q^3 + \cdots$$

whose partial sums are

$$s_n = 1 + q + q^2 + \cdots + q^{n-1}.$$

For $|q| < 1$ the sequence $s_n$ converges toward the limit

$$s = \frac{1}{1 - q},$$

which then represents the sum of the infinite series. For $|q| > 1$ the
partial sums $s_n$ have no limit and the series diverges (see p. 67).
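
A short sketch of the partial sums of the geometric series follows. The function name and the sample values of $q$ are assumptions for illustration; the terms are added in order, as the definition of the partial sum requires.

```python
# Partial sums s_n = 1 + q + ... + q**(n-1) of the geometric series.
def geometric_partial_sum(q, n):
    """Sum of the first n terms, added in the order in which they appear."""
    total, term = 0.0, 1.0
    for _ in range(n):
        total += term
        term *= q
    return total


if __name__ == "__main__":
    for q in (0.5, -0.9, 1.5):
        print(q, [geometric_partial_sum(q, n) for n in (5, 50, 200)])
```

For $|q| < 1$ the printed values settle near $1/(1 - q)$, while for $q = 1.5$ they grow without bound.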
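It is customary to use for $a_1 + a_2 + \cdots + a_n$ the symbol

$$\sum_{k=1}^{n} a_k$$

which indicates that the sum of the $a_k$ is to be taken with $k$ running
through the integers from $k = 1$ to $k = n$. For example,

$$\sum_{k=1}^{4} \frac{1}{k!} \quad \text{stands for} \quad \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!},$$

whereas

$$\sum_{k=1}^{n} a^k b^{2k} \quad \text{stands for} \quad a b^2 + a^2 b^4 + a^3 b^6 + \cdots + a^n b^{2n}.$$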


More generally, $\sum_{k=m}^{n} a_k$ means the sum of all $a_k$ obtained by giving $k$ the
values $m, m+1, m+2, \ldots, n$. Thus

$$\sum_{k=3}^{5} \frac{1}{k} = \frac{1}{3} + \frac{1}{4} + \frac{1}{5}.$$


In these examples we have used the letter $k$ for the index of summation.
Of course, the sum is independent of the letter denoting this
index. Thus

$$\sum_{k=1}^{n} a_k = \sum_{i=1}^{n} a_i.$$

We use the symbol

$$\sum_{k=1}^{\infty} a_k$$

to denote the sum of the whole infinite series. Similarly,
$\sum_{k=0}^{\infty} a_k$ would
stand for the sum of the infinite series $a_0 + a_1 + a_2 + \cdots$, whose nth
partial sum is $s_n = a_0 + a_1 + a_2 + \cdots + a_{n-1}$.


Many of our earlier results can be written more concisely in this
summation notation. The formula of p. 58, for the sum of the first
$n$ squares becomes

$$\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

The formula for the sum of a geometric series is

$$\sum_{k=0}^{\infty} q^k = \frac{1}{1 - q} \quad \text{for } |q| < 1.$$

Finally, the binomial theorem is expressed by

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$$
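
The summation symbol translates almost literally into a loop. As a minimal sketch (the function names are ours, chosen for illustration), the formula for the sum of the first $n$ squares and the binomial theorem can be checked with Python's built-in `sum` and `math.comb`:

```python
# The summation symbol corresponds directly to Python's built-in sum().
from math import comb


def sum_of_squares(n):
    """sum_{k=1}^{n} k**2, added term by term."""
    return sum(k * k for k in range(1, n + 1))


def binomial_expansion(a, b, n):
    """sum_{k=0}^{n} C(n, k) * a**(n-k) * b**k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


if __name__ == "__main__":
    print(sum_of_squares(10), 10 * 11 * 21 // 6)
    print(binomial_expansion(2, 3, 5), (2 + 3) ** 5)
```

Renaming the loop variable `k` changes nothing, which is the statement that the sum is independent of the letter used for the index.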


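Since an infinite series is merely the limit of a sequence $s_n$, convergence
can be decided on the basis of the convergence tests for sequences.
For example, the convergence of the series

$$\sum_{k=1}^{\infty} \frac{1}{k^k}$$

follows immediately from the fact that the partial sums

$$s_n = \sum_{k=1}^{n} \frac{1}{k^k} = \frac{1}{1^1} + \frac{1}{2^2} + \frac{1}{3^3} + \cdots + \frac{1}{n^n}$$

increase monotonically with $n$ and are bounded since

$$1 \le s_n \le 1 + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n} = 1 + \frac{1}{4} \cdot \frac{1 - 1/2^{n-1}}{1 - \frac{1}{2}} < \frac{3}{2}.$$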


Later, in Chapter 7, we shall study infinite series more system- 
atically. 


e. The Number e 


As a first example of a number which is generated as the limit of a
sequence, we consider¹

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots.$$

Thus $e$ stands for $\lim_{n \to \infty} S_n$, where

$$S_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}.$$

¹ Remembering the convention defining $0!$ as 1, we can write the first term of the
series as $1/0!$ in agreement with the law of formation of the following terms. Notice
that in our notation $S_n$ is really the $(n + 1)$st partial sum of the infinite series,
instead of the nth. This is, however, of no significance.

The numbers $e$ and $\pi$ are the most widely used transcendental constants
in mathematical analysis. In order to prove the existence of the limit $e$
we need only prove that the sequence $S_n$ is bounded since the numbers
$S_n$ increase monotonically. For all values of $n$ we have

$$S_n = 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{2 \cdot 3 \cdots n}$$

$$\le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} = 1 + \frac{1 - 1/2^n}{1 - \frac{1}{2}} < 3.$$

The numbers $S_n$ therefore have the upper bound 3, and since they
form a monotonic increasing sequence, they possess a limit which we
denote by $e$.

The expression for $e$ as a series permits us to compute $e$ rapidly
with great accuracy. The error committed in approximating $e$ by a
partial sum $S_m$ can be estimated by the same method of comparison
with a geometric series that furnished the upper bound 3 for $e$. We
have for any $n > m$

$$S_n - S_m = \frac{1}{(m+1)!} + \frac{1}{(m+2)!} + \cdots + \frac{1}{n!}$$

$$= \frac{1}{(m+1)!} \left[ 1 + \frac{1}{m+2} + \frac{1}{(m+2)(m+3)} + \cdots + \frac{1}{(m+2) \cdots n} \right]$$

$$\le \frac{1}{(m+1)!} \left[ 1 + \frac{1}{m+1} + \frac{1}{(m+1)^2} + \cdots + \frac{1}{(m+1)^{n-m-1}} \right]$$

$$< \frac{1}{(m+1)!} \cdot \frac{1}{1 - \dfrac{1}{m+1}} = \frac{1}{m \cdot m!}.$$

Hence for $n > m$

$$S_m < S_n < S_m + \frac{1}{m \cdot m!}.$$

Letting $n$ increase beyond all bounds while holding $m$ fixed we find
also that

$$S_m < e \le S_m + \frac{1}{m \cdot m!}.$$

Hence $e$ differs from $S_m$ by at most $(1/m)(1/m!)$. Since $m!$ increases
extremely rapidly with $m$, the number $S_m$ is a good approximation for
$e$ already for fairly small $m$; for example, $S_{10}$ differs from $e$ by less
than $10^{-7}$. In this way we find that $e = 2.718281\ldots$.
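
The estimate $S_m < e \le S_m + 1/(m \cdot m!)$ can be tried out directly. The following is a minimal sketch using exact fractions; the function names are ours, and the comparison value is the floating-point constant `math.e`, so the check is only as fine as double precision allows.

```python
# Partial sums S_m = 1 + 1/1! + ... + 1/m! and the error bound 1/(m * m!).
from fractions import Fraction
from math import e, factorial


def partial_sum_e(m):
    """Exact value of S_m as a fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(m + 1))


def error_bound(m):
    """The bound 1/(m * m!) on e - S_m, valid for m >= 1."""
    return Fraction(1, m * factorial(m))


if __name__ == "__main__":
    for m in (2, 5, 10):
        print(m, float(partial_sum_e(m)), float(error_bound(m)), e)
```

Already for $m = 10$ the bound is smaller than $10^{-7}$, in agreement with the remark above.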
$e$ is an irrational number. The estimate for $e$ in terms of $S_m$ can also
be used to establish this fact. Indeed, if $e$ were rational, we could
write $e$ in the form $p/m$ with positive integers $p$, $m$; here, $m \ge 2$, since
$e$, lying between 2 and 3, cannot be an integer. Comparing $e$ with the
partial sum $S_m$, we would have

$$S_m < \frac{p}{m} \le S_m + \frac{1}{m \cdot m!}.$$

If we here multiply both sides by $m!$, we find that

$$m!\,S_m < p\,(m-1)! \le m!\,S_m + \frac{1}{m} < m!\,S_m + 1.$$

But

$$m!\,S_m = m! + m! + \frac{m!}{2!} + \frac{m!}{3!} + \cdots + \frac{m!}{m!}$$

is an integer since each term in the sum is. Thus, if $e$ were rational, the
integer $p(m - 1)!$ would lie between two successive integers, which is
impossible.¹

¹ The irrationality of the number $e$ means that there is no linear equation $ax + b = 0$
with rational coefficients $a$, $b$ and $a \ne 0$ having $e$ as a solution. A much stronger
statement has been proved (by Hermite), that there exists no polynomial equation
$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0$ of any degree $n$ whatsoever and with rational
coefficients $a_0, a_1, \ldots, a_n$ (with $a_0 \ne 0$) with $x = e$ as a root. One says that $e$ is a
transcendental number in contrast to “algebraic” numbers like $\sqrt{2}$ or $\sqrt[3]{10}$ that are
roots of certain polynomial equations with rational coefficients.

$e$ As Limit of $(1 + 1/n)^n$. The number $e$ that was defined here as
the sum of an infinite series can also be obtained as the limit of the
sequence

$$T_n = \left(1 + \frac{1}{n}\right)^n.$$

The proof is simple and at the same time an instructive example of
operations with limits. According to the binomial theorem,

$$T_n = \left(1 + \frac{1}{n}\right)^n = 1 + n\,\frac{1}{n} + \frac{n(n-1)}{2!}\,\frac{1}{n^2} + \cdots + \frac{n(n-1)(n-2) \cdots 1}{n!}\,\frac{1}{n^n}$$

$$= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right).$$

From this we see at once that $T_n \le S_n < 3$. Furthermore, since we
obtain $T_{n+1}$ from $T_n$ by replacing the factors $1 - 1/n$, $1 - 2/n, \ldots$ by
the larger factors $1 - 1/(n + 1)$, $1 - 2/(n + 1), \ldots$ and finally adding
a positive term we see that the $T_n$ also form a monotonic increasing
sequence, from which the existence of the limit $\lim_{n \to \infty} T_n = T$ follows.


To prove that $T = e$, we observe that for $m > n$

$$T_m > 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{m}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{m}\right) \cdots \left(1 - \frac{n-1}{m}\right).$$

If we now keep $n$ fixed and let $m$ increase beyond all bounds, we obtain
on the left the number $T$ and on the right the expression $S_n$, so that
$T \ge S_n$. Thus $T \ge S_n \ge T_n$ for every value of $n$. We now let $n$
increase, so that $T_n$ tends to $T$; from the double inequality it follows
that $T = \lim_{n \to \infty} S_n = e$. This was the statement to be proved.
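
The relation between the two sequences can be observed numerically. In the sketch below, `T` and `S` are our own names for $T_n = (1 + 1/n)^n$ and the partial sums $S_n$; exact fractions are used so that the inequalities $T_n \le S_n < 3$ are tested without rounding.

```python
# Compare T_n = (1 + 1/n)**n with the partial sums S_n of the series for e.
from fractions import Fraction
from math import factorial


def T(n):
    return (1 + Fraction(1, n)) ** n


def S(n):
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, float(T(n)), float(S(n)))
```

The output suggests how much more slowly $T_n$ approaches $e$ than $S_n$ does, which is why the series is preferred for actual computation.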


We shall later (Section 2.6, p. 149) be led to this number e again from 
still another point of view. 


f. The Number π as a Limit


A limiting process which in essence goes back to classical antiquity
(Archimedes) is that by which the number $\pi$ is defined. Geometrically,
$\pi$ means the area of the circle of radius one. We regard it as obvious
that this area can be expressed by a (rational or irrational) number,
denoted by $\pi$. However, this definition is not of much help to us if we
wish to calculate the number with any accuracy. We then have no
choice but to represent the number by means of a limiting process,
namely, as the limit of a sequence of known and easily calculated
numbers. Archimedes already used this process in his method of
exhaustion, which consists of approximating the circle by means of
regular polygons with an increasing number of sides fitting it more and
more closely. If we let $f_m$ denote the area of the regular m-gon (polygon
of $m$ sides) inscribed in the circle, the area of the inscribed 2m-gon is
given by the formula [proved by elementary geometry or from the
expression $f_n = (n/2) \sin (2\pi/n)$ (see Fig. 1.46)]

$$f_{2m} = \frac{m}{2} \sqrt{2 - 2\sqrt{1 - (2f_m/m)^2}}.$$

We now let $m$ range, not through the sequence of all positive integers but
through the sequence of powers of 2, that is, $m = 2^n$; in other words,
we form those regular polygons whose vertices are obtained by repeated
bisection of the circumference. It is clear from the geometric
interpretation that the $f_{2^n}$ form an increasing and bounded sequence and thus
have a limit which is the area of the circle:

$$\pi = \lim_{n \to \infty} f_{2^n}.$$

This representation of $\pi$ as a limit serves actually as a basis for
numerical computations; for, starting with the value $f_4 = 2$, we can
calculate in order the terms of our sequence tending to $\pi$. An estimate
of the accuracy with which any term $f_{2^n}$ represents $\pi$ can be obtained by
constructing the lines touching the circle and parallel to the sides of the
inscribed $2^n$-gon. These lines form a circumscribed polygon similar to the
inscribed $2^n$-gon and having larger dimensions in the ratio $1 : \cos (\pi/2^n)$.
Hence the area $F_{2^n}$ of the circumscribed polygon may be found from
the ratio given by

$$\frac{f_{2^n}}{F_{2^n}} = \left(\cos \frac{\pi}{2^n}\right)^2.$$

Figure 1.46

Since the area of the circumscribed polygon is greater than that of the
circle, we have

$$f_{2^n} < \pi < F_{2^n} = \frac{f_{2^n}}{\left(\cos \dfrac{\pi}{2^n}\right)^2} = \frac{2 f_{2^n}}{1 + \cos \dfrac{\pi}{2^{n-1}}}.$$

For example, $f_8 = 2\sqrt{2}$, so that we have the estimate

$$2\sqrt{2} < \pi < \frac{4\sqrt{2}}{1 + \sqrt{1/2}}.$$
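
The doubling procedure lends itself to a short computation. The sketch below is our own illustration, not the method of Section 6: it starts from $f_4 = 2$, applies the formula for $f_{2m}$ repeatedly, and obtains the circumscribed area from $F_{2^n} = f_{2^n}/\cos^2(\pi/2^n)$, where the cosine is tracked by the half-angle formula. The function names are assumptions, and the expression $2 - 2\sqrt{1 - s^2}$ loses accuracy through cancellation when $m$ is very large, so only moderate $n$ should be used.

```python
# Archimedes-style doubling: areas of inscribed regular polygons in the unit circle.
from math import sqrt


def double_area(f_m, m):
    """Area of the inscribed 2m-gon, computed from the area f_m of the m-gon."""
    return (m / 2) * sqrt(2 - 2 * sqrt(1 - (2 * f_m / m) ** 2))


def pi_bounds(n):
    """Return (f, F): inscribed and circumscribed areas for the 2**n-gon, n >= 2."""
    m, f = 4, 2.0                       # the inscribed square has area f_4 = 2
    cos_half = sqrt(0.5)                # cos(pi/m) for m = 4
    for _ in range(n - 2):
        f = double_area(f, m)
        m *= 2
        cos_half = sqrt((1 + cos_half) / 2)   # half-angle formula gives cos(pi/m) for the new m
    return f, f / cos_half ** 2


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(2 ** n, *pi_bounds(n))
```

Each line printed gives a lower and an upper bound for $\pi$; the gap shrinks by roughly a factor of four with every doubling.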

These are matters with which the reader will be more or less familiar.
What we wish to point out, however, is that the calculation of areas
by means of exhaustion by rectilinear figures whose areas can be
calculated easily forms the basis for the concept of integral, to be
introduced in Chapter 2. For the actual numerical computation of
$\pi$ much more efficient methods are available, as we shall see in
Section 6.26.


1.8 The Concept of Limit for Functions of a Continuous Variable


Hitherto we have considered limits of sequences, that is, of functions 
of an integral variable n. The notion of limit, however, frequently 
occurs in connection with a function f(x) that is defined for all x in 
some interval. 

We say that the value of the function $f(x)$ tends to a limit $\eta$ as $x$
tends to $\xi$, or in symbols,

$$\lim_{x \to \xi} f(x) = \eta,$$

if $f(x)$ differs arbitrarily little from $\eta$ for all $x$ for which $f(x)$ is defined
and which lie sufficiently near to $\xi$.¹ Expressed more precisely the
definition of $\lim f(x)$ is as follows.

If, whenever an arbitrary positive quantity $\epsilon$ is assigned, we can mark
off an interval $|x - \xi| < \delta$ so small that for any $x$ which belongs both
to the domain of $f$ and to that interval the inequality $|f(x) - \eta| < \epsilon$
holds, then $\lim_{x \to \xi} f(x) = \eta$.


There is a close connection between the concepts of limit of a function
and continuity. If $\xi$ belongs to the domain of $f$, that is, if $f(\xi)$ is
defined, then $\lim_{x \to \xi} f(x)$, if it exists at all, must have the value $f(\xi)$.
Indeed, the definition of $\eta = \lim f(x)$ implies in particular $|f(\xi) - \eta| < \epsilon$
for every positive $\epsilon$ and hence $\eta = f(\xi)$. Now, comparing the definitions
of limit and of continuity, we see that the relation

$$\lim_{x \to \xi} f(x) = f(\xi)$$

just expresses the continuity of the function $f$ at the point $\xi$. Hence for $\xi$
in the domain of $f$ the existence of $\lim_{x \to \xi} f(x)$ just signifies that $f$ is
continuous at $\xi$. More generally, if $f$ is not defined at $\xi$ but $\lim_{x \to \xi} f(x)$
exists and has the value $\eta$, we can assign to $f$ at the point $\xi$ the value $\eta$
and the function $f$, thus completed, will be continuous at $\xi$. (Removable
Singularity. See p. 35.)


The limit of a function can also be described completely in terms of limits
of sequences. The statement

$$\lim_{x \to \xi} f(x) = \eta$$

means that

$$\lim_{n \to \infty} f(x_n) = \eta$$

for every sequence $x_n$ with limit $\xi$ (where it is assumed, of course, that the $x_n$
belong to the domain of $f$). For if $\lim_{x \to \xi} f(x) = \eta$ and if
$\lim_{n \to \infty} x_n = \xi$, then $f(x)$
is arbitrarily close to $\eta$ for $x$ sufficiently close to $\xi$; but $x_n$ is sufficiently close
to $\xi$ if only $n$ is large enough, and consequently, $\lim_{n \to \infty} f(x_n) = \eta$. If, on the
other hand, $\lim_{n \to \infty} f(x_n) = \eta$ whenever $x_n \to \xi$, we must have
$\lim_{x \to \xi} f(x) = \eta$.
Otherwise there would exist a positive $\epsilon$ such that $|f(x) - \eta| \ge \epsilon$ for some
$x$ arbitrarily close to $\xi$; there would then also exist a sequence $x_n$ converging
to $\xi$ for which $|f(x_n) - \eta| \ge \epsilon$, but then $\lim f(x_n)$ could not be $\eta$.

¹ It is assumed here that arbitrarily close to $\xi$ there are points where $f$ is defined.


Continuity of the function $f(x)$ at the point $\xi$ implies then:
$\lim_{n \to \infty} f(x_n) = f(\xi)$
for every sequence $x_n$ in the domain of $f$ that converges to $\xi$. More generally,
for a function continuous in an interval the relation

$$\lim_{n \to \infty} f(x_n) = f\left(\lim_{n \to \infty} x_n\right)$$

is valid for any sequence in the domain of $f$ which converges to a point of the
interval. We see that for a continuous function the limit symbol can be
interchanged (or, as one says, “commutes”) with the symbols for the function.


Limits of sums, products, and quotients of functions are found by
the same rules as for sequences (see p. 71): If $\lim_{x \to \xi} f(x) = \eta$ and
$\lim_{x \to \xi} g(x) = \zeta$ exist, then

$$\lim_{x \to \xi} [f(x) + g(x)] = \eta + \zeta, \qquad \lim_{x \to \xi} f(x)\,g(x) = \eta\zeta,$$

and for $\zeta \ne 0$ also

$$\lim_{x \to \xi} \frac{f(x)}{g(x)} = \frac{\eta}{\zeta}.$$

The proofs are the same as for sequences. (The rules would also
follow from those for sequences by writing limits of functions as
limits of sequences.) Consequently, when $\xi$ belongs to the domain of
$f$ and $g$, the sum, product, and quotient of two functions $f(x)$ and $g(x)$
which are continuous at a point $\xi$ are again continuous (where for
quotients we have to assume that $g(\xi) \ne 0$).

The cases where $\xi$ does not belong to the domain of $f$ will turn out to
be of particular importance for differential calculus. As a first example
we consider the relation

$$\lim_{x \to \xi} \frac{x^n - \xi^n}{x - \xi} = n\xi^{n-1}$$

for $n$ a positive integer. Of course, $f(x) = (x^n - \xi^n)/(x - \xi)$ is a
function defined only for $x \ne \xi$. But for $x \ne \xi$ the algebraic identity

$$\frac{x^n - \xi^n}{x - \xi} = x^{n-1} + x^{n-2}\xi + x^{n-3}\xi^2 + \cdots + \xi^{n-1}$$

is valid as a consequence of the summation formula for the geometric
series. To find the limit we only have to let $x$ tend to $\xi$ and to evaluate
the limit of the right-hand side by the rules for limits of sums and
products; each of the $n$ terms tends to $\xi^{n-1}$.
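
Both the algebraic identity and the limit can be checked by a small computation. The sketch below is an illustration under our own choice of names and sample values ($\xi = 3/2$ and a few exponents $n$); exact fractions make the identity hold without rounding error.

```python
# (x**n - xi**n) / (x - xi) equals x**(n-1) + x**(n-2)*xi + ... + xi**(n-1) for x != xi.
from fractions import Fraction


def quotient(x, xi, n):
    return (x ** n - xi ** n) / (x - xi)


def expanded(x, xi, n):
    return sum(x ** (n - 1 - k) * xi ** k for k in range(n))


if __name__ == "__main__":
    xi, n = Fraction(3, 2), 5
    for h in (Fraction(1, 10), Fraction(1, 1000)):
        print(float(quotient(xi + h, xi, n)), float(n * xi ** (n - 1)))
```

As $x$ is taken closer to $\xi$, the quotient approaches $n\xi^{n-1}$, although it is never evaluated at $x = \xi$ itself.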

Less obvious is the formula

$$\lim_{x \to 0} \frac{\sin x}{x} = 1$$

(where, of course, the angle $x$ is measured in “radians,” as explained
on p. 50). Again the quotient $(\sin x)/x$ is defined only for $x \ne 0$.


Figure 1.47


But, if we define $(\sin x)/x = 1$ for $x = 0$ we complete the quotient as
a function which is continuous also at $x = 0$. For the proof of the
limit formula we appeal here to a geometric argument.

From Fig. 1.47 we find by comparing the areas of the triangles OAB
and OAC and the sector OAB¹ of the unit circle that if $0 < x < \pi/2$

$$\tfrac{1}{2} \sin x < \tfrac{1}{2} x < \tfrac{1}{2} \tan x.$$

From this it follows that if $0 < |x| < \pi/2$,

$$1 < \frac{x}{\sin x} < \frac{1}{\cos x}.$$

Hence the quotient $(\sin x)/x$ lies between the numbers 1 and $\cos x$. We
know that $\cos x$ tends to 1 as $x \to 0$, and from this it follows that the
quotient $(\sin x)/x$ can differ only arbitrarily little from 1, provided that
$x$ is near enough to 0. This is exactly what is meant by the equation
which was to be proved.

¹ Of course, we could have defined the angle $x$ in the first place as twice the area of
sector OAB.
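From the result just proved it follows that

$$\lim_{x \to 0} \frac{\tan x}{x} = \lim_{x \to 0} \frac{\sin x}{x} \cdot \lim_{x \to 0} \frac{1}{\cos x} = 1,$$

and also

$$\lim_{x \to 0} \frac{1 - \cos x}{x} = 0.$$

This last follows from the formula, valid for $0 < |x| < \pi/2$,

$$\frac{1 - \cos x}{x} = \frac{(1 - \cos x)(1 + \cos x)}{x(1 + \cos x)} = \frac{1 - \cos^2 x}{x(1 + \cos x)} = \frac{\sin x}{x} \cdot \frac{1}{1 + \cos x} \cdot \sin x.$$

For $x \to 0$ the first factor on the right tends to 1, the second to $\tfrac{1}{2}$, and the
third to 0; the product therefore tends to 0, as was stated.
Dividing the same formula by $x$, we obtain

$$\frac{1 - \cos x}{x^2} = \left(\frac{\sin x}{x}\right)^2 \frac{1}{1 + \cos x},$$

from which

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}.$$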
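
These trigonometric limits can be observed numerically. The following sketch uses our own function names; the quotient $(1 - \cos x)/x^2$ is evaluated through the rewritten form $(\sin x / x)^2 / (1 + \cos x)$, because the direct form suffers from cancellation in floating-point arithmetic for small $x$.

```python
# Numerical look at sin(x)/x, tan(x)/x and (1 - cos x)/x**2 near x = 0.
from math import cos, sin, tan


def sinc_ratio(x):
    return sin(x) / x


def cos_ratio(x):
    """(1 - cos x) / x**2, written as (sin x / x)**2 / (1 + cos x)."""
    return (sin(x) / x) ** 2 / (1 + cos(x))


if __name__ == "__main__":
    for x in (1.0, 0.1, 0.01, 0.001):
        print(x, sinc_ratio(x), tan(x) / x, cos_ratio(x))
```

The first two columns approach 1 and the last approaches $\tfrac{1}{2}$ as $x$ decreases.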


Limits for $x \to \infty$. Finally we remark that it is just as well possible
to consider limiting processes in which the continuous variable $x$
increases beyond all bounds. For example, the meaning of the equation

$$\lim_{x \to \infty} \frac{x^2 + 1}{x^2 - 1} = \lim_{x \to \infty} \frac{1 + 1/x^2}{1 - 1/x^2} = 1$$

is clear. It signifies that the function on the left differs arbitrarily little
from one, provided only that $x$ is sufficiently large. The rules for
forming the limits of this kind for sums, products, and quotients are
the same as before.

* There is one further result which is frequently useful in the calculation
of limits, the rule for obtaining the limit of a compound function.
The compound function $f(g(z))$ is defined for those values of $z$ for which
$x = g(z)$ lies in the domain of $f(x)$. The function $g(z)$ may be a function
of a continuous variable or an integer variable, but $f(x)$ must be a
function of a continuous variable.

If $\lim_{z \to \zeta} g(z) = \xi$ where $\xi$ lies within an open interval of the domain of
$f$ and $\lim_{x \to \xi} f(x) = \eta$, then $\lim_{z \to \zeta} f(g(z)) = \eta$. As a corollary we observe
that a continuous function of a continuous function is itself continuous
(as already mentioned on p. 55).

The result is obvious from the fact that we can make $f(x)$ arbitrarily
close to $\eta$ by taking $x$ sufficiently close to $\xi$ and to make $x = g(z)$ close
enough to $\xi$ we have only to take $z$ sufficiently close to $\zeta$. With slight
modifications, the same statements apply when any of the variables is
allowed to increase beyond all bounds.


a. Some Remarks about the Elementary Functions 


So far we tacitly assumed that the elementary functions are
continuous. The proof of this fact is very simple. First, the function
$f(x) = x$ is continuous; therefore $x^2 = x \cdot x$ is continuous, as the
product of two continuous functions, and every power of $x$ is likewise
continuous. Thus every polynomial is continuous, being the sum of
continuous functions. Every rational function, as a quotient of
continuous functions, is likewise continuous in every interval in which the
denominator does not vanish.

The function $x^n$ is continuous and monotonic for $x > 0$. Hence the
nth root, being the inverse function of the nth power, is continuous.
From this fact it is easy to conclude that the nth root of a rational
function is continuous (except where the denominator vanishes).

The continuity of the trigonometric functions could now be proved, 
using the concepts already developed. However, we omit the dis- 
cussion here, since in Chapter 2 (p. 166), the continuity of all these 
functions will be seen to follow simply as a consequence of their 
differentiability. 


We merely make a few remarks about the definition and continuity of the
exponential function $a^x$, the general power function $x^\alpha$, and the logarithm.
We assume, as in Section 1.3 (p. 51), that $a$ is a positive number, say greater
than one, and $r = p/q$ is a positive rational number ($p$ and $q$ being integers);
then $a^r = a^{p/q}$ is the positive number whose qth power is $a^p$. If $\alpha$ is any
irrational number and $r_1, r_2, \ldots, r_m, \ldots$ is a sequence of rational numbers
approaching $\alpha$, we assert that $\lim_{m \to \infty} a^{r_m}$ exists; we then call this limit $a^\alpha$.
In order to prove the existence of this limit by Cauchy’s test, we need show
only that $|a^{r_n} - a^{r_m}|$ is arbitrarily small, provided that $n$ and $m$ are sufficiently
large. We suppose, for example, that $r_n > r_m$, or that $r_n - r_m = \delta$, where
$\delta > 0$. Then

$$a^{r_n} - a^{r_m} = a^{r_m}(a^\delta - 1).$$

Since the $r_m$ converge to $\alpha$, they are bounded and so are the $a^{r_m}$; thus it
suffices to show that

$$|a^\delta - 1| = a^\delta - 1$$

is arbitrarily small when the values of $n$ and $m$ are sufficiently large. However,
the rational number $\delta$ certainly may be made as small as we please provided
the values of $n$ and $m$ are sufficiently large. Hence if $l$ is an arbitrarily large
positive integer, $\delta < 1/l$ if $n$ and $m$ are large enough. Now the relations
$\delta < 1/l$ and $a > 1$ give¹

$$1 < a^\delta < a^{1/l},$$

and since $a^{1/l}$ tends to one as $l$ increases to infinity (cf. p. 64), our assertion
follows immediately.
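
The construction can be imitated numerically. In the sketch below we take $a = 2$ and $\alpha = \sqrt{2}$, with the rational exponents $r_m$ obtained by truncating the decimal expansion of $\alpha$; these choices and the function names are assumptions made for illustration. Each power $a^{p/q}$ is formed as $(a^{1/q})^p$, and floating-point roots stand in for the exact positive roots.

```python
# Approximating a**alpha for irrational alpha by rational exponents r_m = p/q.
# Assumption for this sketch: a = 2, alpha = sqrt(2), r_m = decimal truncations.
from math import floor, sqrt


def rational_exponents(alpha, count):
    """Decimal truncations r_m = floor(alpha * 10**m) / 10**m as (p, q) pairs."""
    return [(floor(alpha * 10 ** m), 10 ** m) for m in range(1, count + 1)]


def power_rational(a, p, q):
    """a**(p/q), formed as the p-th power of the positive q-th root of a."""
    return (a ** (1.0 / q)) ** p


if __name__ == "__main__":
    for p, q in rational_exponents(sqrt(2), 6):
        print(p, q, power_rational(2.0, p, q))
```

Successive values differ less and less, in keeping with the Cauchy test, and they settle near the value that the library power function gives for $2^{\sqrt{2}}$.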

It can be shown that the function $a^x$ extended to irrational values in this
way is also continuous everywhere, and, moreover, that it is monotonic.
For negative values of $x$ this function is naturally defined by the equation

$$a^x = \frac{1}{a^{-x}}.$$

As $x$ runs from $-\infty$ to $+\infty$, $a^x$ takes all values between zero and $+\infty$.
Consequently, it possesses a continuous and monotonic inverse function,
which we call the logarithm to the base $a$. In like manner we can prove that
the general power $x^\alpha$ is a continuous function of $x$, where $\alpha$ is any fixed rational
or irrational number and $x$ varies over the interval $0 < x < \infty$, and is
monotonic if $\alpha \ne 0$.

The “elementary” discussion of the exponential function, the logarithm,
and the power $x^\alpha$ outlined here will later (p. 149) be replaced by another
discussion which in principle is much simpler.

¹ This statement follows from the fact that for $a > 1$ the power $a^{m/n}$ is greater than
one if $m/n$ is positive. For $a^{m/n} = (a^{1/n})^m$ is the product of $m$ factors all greater than
one, and so is greater than one.


Supplement


One of the great achievements of Greek mathematics was the
reduction of mathematical statements and theorems in a logically
coherent way to a small number of very simple postulates or axioms,
the well-known axioms of geometry or the rules of arithmetic governing
relations among a few basic objects, such as integers or geometrical
points. The basic objects originate as abstractions or idealizations
from physical reality. The axioms, whether considered as “evident”
from a philosophical point of view or merely as overwhelmingly
plausible, are accepted without proof; on them the crystallized structure
of mathematics rests. For many centuries the axiomatic Euclidean
mathematics was accepted as a model for mathematical style and even
imitated for other intellectual endeavors. (For example, philosophers,
such as Descartes and Spinoza, tried to make their speculations more
convincing by presenting them axiomatically or, as they said, “more
geometrico.”)

The axiomatic method was discarded when after the stagnation during 
the Middle Ages mathematics in union with natural science started an 
explosively vigorous development based on the new calculus. Ingenious 
pioneers vastly extending the scope of mathematics could not be 
hampered by having to subject the new discoveries to consistent 
logical analysis and thus in the seventeenth century an invocation of 
intuitive evidence became a widely used substitute for deductive proof. 
Mathematicians of first rank operated with the new concepts guided 
by an unerring feeling for the correctness of the results, sometimes 
even with mystical associations as in references to “‘infinitesimals”’ or 
“infinitely small quantities.” Faith in the sweeping power of the new 
manipulations of calculus carried the investigators far along paths 
impossible to travel if subjected to the limitations of complete rigor. 
Only the sure instinct of great masters could guard against gross errors. 

The uncritical but enormously fruitful enthusiasm of the early period 
gradually met with countercurrents which rose to full strength in the 
nineteenth century but did not impede the development of constructive 
analysis initiated earlier. Many of the great mathematicians of the 
nineteenth century, in particular Cauchy and Weierstrass, played a role 
in the effort toward critical reappraisal. The result was not only a 
new and firm foundation of analysis, but also increased lucidity and 
simplicity as a basis for further remarkable progress. 

An important goal was to replace indiscriminate reliance on imprecise 
“intuition” by precise reasoning based on operations with numbers; for 
naive geometric thinking leaves an undesirable margin of vagueness as 
we shall see time and again in the following chapters. For example, 
the general concept of a continuous curve eludes geometrical intuition. 
A continuous curve, representing a continuous function, as defined 
earlier, need not have a definite direction at every point; we can even 
construct continuous functions whose graphs nowhere have a direction, 
or to which no length can be assigned. 

Yet one must never forget that abstract deductive reasoning is 
merely one aspect of mathematics while the driving motivation and the 
great universal scope of analysis stem from physical reality and 
intuitive geometry. 

This supplement will provide a rigorous buttressing (with some 
repetitions) for basic concepts treated intuitively earlier in this chapter. 


S.1 Limits and the Number Concept 


We start with the ideas of Section 1.1, analyzing fully the concept of 
real number and its connection with that of limit. We define the 
number continuum by a constructive procedure based on the natural 
numbers. We then prove that the extended number concept satisfies 
the rules of arithmetic and the other requirements, making it the 
adequate tool for measurement. 

Since a complete exposition would require a separate book,¹ we 
shall indicate only the main steps. In struggling through the somewhat 
tedious material the student will marvel at the fact that on the basis of 
the natural numbers the human mind could erect a logically consistent 
number system superbly suited to the task of scientific measurement.² 


a. The Rational Numbers 


Limits Defined by Rational Intervals. We begin by accepting the 
system of rational numbers with all its usual properties, derived from 
the basic properties of natural numbers. Thus the rational numbers 
are ordered by magnitude, permitting us to define “rational” intervals 
as sets of rational numbers lying between two given rational numbers 
(intervals including the end points are called closed). The length of the 
interval with end points $a$, $b$ is $|b - a|$. As observed in Section 1.1a the 
rational numbers are dense and every rational interval contains infinitely 
many rational numbers. For the time being, all quantities occurring 
are assumed to be rational numbers. 

Within the domain of rational numbers we define sequences and 
limits (see p. 70). Given an infinite sequence of rational numbers 
$a_1, a_2, \ldots$ and a rational number $r$ we say that 

$$\lim_{n \to \infty} a_n = r$$


1 See for example, E. Landau, Foundations of Analysis, 2nd Ed., Chelsea, New York, 
1960. 

2 Real numbers can also be introduced purely axiomatically, with all their basic 
properties accepted as axioms. In the approach we shall take here we accept, in 
principle, only the axioms for natural numbers (including the principle of 
mathematical induction). The rational numbers and real numbers are then constructed 
on that basis. The “axioms” for real numbers are then, in principle, merely theorems 
about natural numbers for which proofs are required. Actually, we shall start 
already with the rational numbers as known elements, since the construction of the 
rational from the natural numbers and the derivation of the basic properties of 
rational numbers present no difficulties at all. 

if every rational interval containing $r$ in its interior also contains 
“almost all” $a_n$, that is, all $a_n$ with at most a finite number of 
exceptions. It follows immediately that a sequence of rational numbers 
cannot have more than one rational limit and that the usual rules for 
limits of sums, differences, products, and quotients (see p. 71) are 
valid for sequences of rational numbers with rational limits. 

An entirely obvious consequence of this definition is that passing to 
the limit preserves order: if $\lim a_n = a$, $\lim b_n = b$ and for every $n$, 
$a_n \le b_n$, then $a \le b$. Note that even assuming $a_n < b_n$ strictly, we 
cannot say more than $a \le b$, or exclude possible equality of the limits 
(for example, both sequences $a_n = 1 - 2/n$ and $b_n = 1 - 1/n > a_n$ 
have the limit 1). 

Statements about limits can be expressed in terms of rational 
null-sequences, that is, sequences $a_1, a_2, \ldots$ of rational numbers for 
which 

$$\lim_{n \to \infty} a_n = 0.$$

One says $a_n$ “becomes arbitrarily small as $n$ tends to infinity,” meaning 
that for any positive rational $\epsilon$, no matter how small, the inequality 
$|a_n| < \epsilon$ holds for almost all $n$. Obviously the sequence $a_n = 1/n$ is 
a null-sequence. 

Thus a sequence of rational numbers $a_n$ has the rational limit $r$ if 
and only if the numbers $r - a_n$ form a null-sequence. 
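
As a hedged illustration of this criterion, the following Python sketch uses 
exact rational arithmetic on the two sequences $a_n = 1 - 2/n$ and 
$b_n = 1 - 1/n$ from the example above. The helper name `tail_start` and the 
cutoff `n_max` are assumptions made for this illustration only; inspecting 
finitely many terms can suggest, but never prove, that $r - a_n$ is a 
null-sequence. 

```python
from fractions import Fraction

def tail_start(seq, limit, eps, n_max):
    """Smallest N with |limit - seq(n)| < eps for every n in N..n_max, else None.

    Only finitely many terms are inspected, so this illustrates the
    definition of "almost all n" rather than proving anything.
    """
    start = None
    for n in range(1, n_max + 1):
        if abs(limit - seq(n)) < eps:
            if start is None:
                start = n
        else:
            start = None  # a later exception resets the candidate tail
    return start

def a(n):
    return 1 - Fraction(2, n)   # a_n = 1 - 2/n

def b(n):
    return 1 - Fraction(1, n)   # b_n = 1 - 1/n > a_n

eps = Fraction(1, 100)
print(tail_start(a, 1, eps, 1000))  # index from which |1 - a_n| < 1/100
print(tail_start(b, 1, eps, 1000))  # index from which |1 - b_n| < 1/100
```

Both sequences stay strictly ordered, $a_n < b_n$, yet the same limit 1 is 
approached by each, which is why only $a \le b$ can be asserted for the limits. 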


b. Real Numbers Determined by Nested 
Sequences of Rational Intervals 


We observed on p. 5 that intuitively the rational points are dense 
on the real axis and that there are always rational numbers between 
any two real numbers. This suggests the possibility of rigorously 
defining a real number entirely in terms of order relations with respect 
to the rationals, a procedure we shall now follow. 

A nested sequence of rational intervals (see p. 8) is a sequence of 
closed intervals $J_n$ with rational end points $a_n$, $b_n$, with each interval 
contained in the preceding one, whose lengths form a null-sequence: 

$$a_{n-1} \le a_n \le b_n \le b_{n-1}$$

and 

$$\lim_{n \to \infty} (b_n - a_n) = 0.$$

Since each interval $J_n = [a_n, b_n]$ of a nested sequence contains all 
succeeding intervals, a rational number $r$ lying outside any $J_n$ also lies 
outside and on the same side of all succeeding intervals. Thus a nested 
sequence of rational intervals gives rise to a separation of all rational 
numbers into three classes.¹ The first class consists of the rational 
numbers $r$ lying to the left of the intervals $J_n$ for sufficiently large $n$, or 
for which $r < a_n$ for almost all $n$. The second class consists of the 
rational numbers $r$ contained in all intervals $J_n$. This class contains 
at most one number, since the length of the interval $J_n$ shrinks to zero 
with increasing $n$. The third class consists of the rational numbers $r$ 
for which $r > b_n$ for almost all $n$. It is clear that any number of the 
first class is less than any of the second class, and any number of the 
second class is less than any of the third class. The points $a_n$ themselves 
are either in the first or second class, and the numbers $b_n$ either in the 
second or third class. 

If the second class is not empty, it consists of a single rational 
number $r$. In this case the first class consists of the rational numbers less 
than $r$, the third class of the rational numbers greater than $r$. We say 
then that the nested sequence of intervals $J_n$ represents the rational 
number $r$. For example, the nested sequence of intervals $[r - 1/n, 
r + 1/n]$ represents the number $r$. 

If the second class is empty, then the nested sequence does not 
represent a rational number; these nested sequences then serve to 
represent irrational numbers. The individual intervals $[a_n, b_n]$ of the 
sequence are for this purpose unimportant; only the separation of the 
rational numbers into three classes generated by this sequence is 
essential, telling us where the irrational number fits in among the 
rational ones. 
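
A minimal sketch of such a nested sequence, under assumptions of our own 
choosing: the intervals are produced by repeated bisection of $[1, 2]$ so that 
$a_n^2 < 2 < b_n^2$, which makes the second class empty. The function names 
`nested_intervals_sqrt2` and `classify` are hypothetical, and bisection is 
only one convenient way to generate the intervals; the text does not 
prescribe it. 

```python
from fractions import Fraction

def nested_intervals_sqrt2(steps):
    """Rational intervals [a_n, b_n] with a_n**2 < 2 < b_n**2, halved at each step."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < 2:
            a = mid          # keep the right half
        else:
            b = mid          # keep the left half
        intervals.append((a, b))
    return intervals

def classify(r, intervals):
    """Class of the rational r as far as the computed intervals can decide."""
    a, b = intervals[-1]
    if r < a:
        return "first"       # r < a_n now, hence also for all later n
    if r > b:
        return "third"       # r > b_n now, hence also for all later n
    return "undecided"       # more intervals are needed to separate r

intervals = nested_intervals_sqrt2(20)
print(classify(Fraction(7, 5), intervals))
print(classify(Fraction(3, 2), intervals))
```

Because the end points move monotonically, a rational number found to the 
left or right of one interval stays on that side of all succeeding ones, so 
the answers "first" and "third" are final. 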

Thus we call two nested sequences of rational intervals $[a_n, b_n]$ and 
$[a_n', b_n']$ equivalent if they give rise to the same separation of the 
rational numbers into three classes. The reader should prove as an 
exercise that necessary and sufficient for equivalence is: $a_n' - a_n$ is a 
null-sequence, or also: the inequalities 

$$a_n \le b_n', \qquad a_n' \le b_n$$

hold for all $n$. 

We assign a real number to a nested sequence of rational intervals 
$[a_n, b_n]$. The real numbers determined by two different nested sequences 
will be considered to be equal if the sequences are equivalent. A real 
number then is represented by the separation of the rational numbers 
into three classes generated by equivalent nested sequences of rational 
intervals. If the second class consists of a rational number $r$, we consider 
the real number represented by this separation into classes as 
identical with the rational number $r$. 


1 A so-called “Dedekind Cut.” 


*c. Order, Limits, and Arithmetic Operations for Real Numbers 


Having defined real numbers, we can now define the notions of order, 
sum, difference, product, limit, etc., for them and prove that they have 
the usual properties. To be consistent any definition concerning real 
numbers must: (1) have the ordinary meaning in case the real numbers 
are rational and (2) be independent of the individual nested sequences 
of intervals used to represent the real numbers. 


*Intervals with Real End Points 


Although so far, even for the definition of irrational numbers, the 
end points of nested intervals were assumed to be rational, we must 
now remove such restrictions and show that we can operate with real 
numbers exactly as we do with rational numbers. In carrying out this 
program we have to be careful at each step to avoid reliance on facts 
not yet proved by logical deduction from our basis of departure, the 
rational numbers. 

We shall denote real numbers by letters $x, y, \ldots$. If the real number 
$x$ is given by the nested sequence of rational intervals $[a_n, b_n]$, we write 
$x \sim \{[a_n, b_n]\}$. From our definition of real number we draw a natural 
definition of order for a real number $x \sim \{[a_n, b_n]\}$ relative to a rational 
number $r$. We say that $r < x$, $r = x$, $r > x$ according as $r$ belongs to 
the first, second, or third class of the separation of the rational numbers 
generated by the sequence of nested intervals. This definition is obviously 
independent of the special nested sequence $\{[a_n, b_n]\}$ defining $x$ 
and has the ordinary meaning when $x$ is rational. Equivalently, we 
say that $r < x$ if $r < a_n$ for almost all $n$, $r = x$ if $a_n \le r \le b_n$ for all $n$, 
and $r > x$ if $r > b_n$ for almost all $n$. 

By comparing real numbers with rational numbers we can compare 
real numbers with each other. Let $x \sim \{[a_n, b_n]\}$, $y \sim \{[\alpha_n, \beta_n]\}$. We 
say $x < y$ if there is a rational number $r$ such that $x < r < y$. Clearly, 
this definition does not depend on the particular representations of 
$x$ and $y$ by nested sequences since comparisons with rational $r$ are 
independent of such representations. Thus we say that $x < y$ if there 
exists a rational $r$ such that $b_n < r < \alpha_n$ for almost all $n$, or simply 
if $b_n < \alpha_n$ for almost all $n$. The relation $x < y$ precludes the possibility 
that $y < x$ or $x = y$. Obviously $x < y$ and $y < z$ implies $x < z$. 

For any two real numbers $x$ and $y$, one of the relations $x < y$, 
$x = y$, $y < x$ must hold. For if $x \ne y$ and either number, say $y$, is 
rational, then $y$ must be in the first or third class of the separation 
generated by $x$, that is, either $y < x$ or $x < y$. If neither $x$ nor $y$ is 
rational, the second classes of the corresponding subdivisions are 
empty, and there must be a rational number $r$ in the first class with 
respect to one of the numbers and in the third class with respect to the 
other. Thus either $x < y$ or $y < x$. 

Density. An immediate consequence of these definitions is the 
density of the rational numbers in the sense that between any two real 
numbers $x$, $y$ there is always a rational number $r$. We also observe that 
if a real number $x$ is represented by a nested sequence of rational 
intervals $[a_n, b_n]$, then $a_n \le x \le b_n$ for all $n$. For if $x < a_m$ for some 
$m$, then $b_n < a_m$ for almost all $n$, contradicting the inequality $a_m \le b_n$ 
which holds for all $n$. Hence every real number can be confined to a 
rational interval $[a_n, b_n]$ of arbitrarily small length. 

Once the real numbers are ordered we can talk of intervals with real 
end points. The density of the rational numbers guarantees that every 
such interval includes rational numbers. 

Limits. A real number $x$ is called the limit of a sequence $x_1, x_2, \ldots$ 
of real numbers if every open interval with real end points containing $x$ 
also contains $x_n$ for almost all $n$. This definition is consistent with the 
definition in terms of rational intervals given earlier, in the sense that 
a rational limit of a rational sequence is a limit of the same sequence 
of numbers in the more general sense of a real limit. As a consequence 
of the definition of limit we find that for a real number $x$ represented 
by a nested sequence of rational intervals $[a_n, b_n]$ 

$$x = \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n.$$


* Arithmetic. We next define the arithmetic operations for real 
numbers $x \sim \{[a_n, b_n]\}$ and $y \sim \{[\alpha_n, \beta_n]\}$. This is achieved most easily 
for the operations of addition and subtraction. We define 

$$x + y \sim \{[a_n + \alpha_n,\ b_n + \beta_n]\}, \qquad x - y \sim \{[a_n - \beta_n,\ b_n - \alpha_n]\}.$$

To prove these definitions meaningful is a simple exercise whose details 
are left to the reader (see Problem 3, p. 116). For example, for $x - y$ it 
is necessary only to verify the intervals $[a_n - \beta_n, b_n - \alpha_n]$ form a nested 
sequence with lengths tending to zero, and hence that they represent 
a real number $z$. The fact that $z$ does not depend on the special 
representations of $x$ and $y$ is proved by characterizing the separation of 
rational numbers into three classes generated by $z$ directly in terms of 
$x$ and $y$; for instance, the first class consists of the rational numbers 
$r < z$, or of the $r$ which are exceeded by $a_n - \beta_n$ for some $n$; these $r$ 
are easily seen to be the rational numbers of the form $s - t$, where $s$ 
and $t$ are rational numbers for which $s < x$ and $t > y$. 

The product of the two real numbers $x$, $y$ is for $y > 0$ defined by 

$$x \cdot y \sim \{[a_n \alpha_n,\ b_n \beta_n]\},$$

where we have assumed that all $\alpha_n > 0$; it is obvious what nested 
sequences are proper to use for $xy$ in the case $y < 0$ and $y = 0$. Whenever 
$y$ is a positive rational number, the product $xy$ also is representable 
in the form 

$$x \cdot y \sim \{[a_n y,\ b_n y]\}.$$

For a natural number $y = m$, the product $x \cdot y = mx$ also can be 
obtained by repeated addition of $x$, that is, $mx = x + (m - 1)x = 
x + x + \cdots + x$. 
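
The interval rules for sum and difference can be tried out on concrete 
nested sequences. The sketch below is an illustration under our own 
assumptions: the representatives of $\sqrt{2}$ and $\sqrt{3}$ are generated by 
bisection of $[1, 2]$, and the helper names `sqrt_intervals`, `add`, `sub`, and 
`is_nested` are hypothetical. It only checks the nesting and the shrinking 
lengths for finitely many intervals. 

```python
from fractions import Fraction

def sqrt_intervals(k, steps):
    """Bisection intervals [a_n, b_n] inside [1, 2] with a_n**2 < k < b_n**2 (k = 2 or 3)."""
    a, b = Fraction(1), Fraction(2)
    out = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < k:
            a = mid
        else:
            b = mid
        out.append((a, b))
    return out

def add(xs, ys):
    """Intervals [a_n + alpha_n, b_n + beta_n] representing x + y."""
    return [(a + al, b + be) for (a, b), (al, be) in zip(xs, ys)]

def sub(xs, ys):
    """Intervals [a_n - beta_n, b_n - alpha_n] representing x - y."""
    return [(a - be, b - al) for (a, b), (al, be) in zip(xs, ys)]

def is_nested(intervals):
    """Check that each interval is contained in the preceding one."""
    return all(a0 <= a1 <= b1 <= b0
               for (a0, b0), (a1, b1) in zip(intervals, intervals[1:]))

x = sqrt_intervals(2, 30)
y = sqrt_intervals(3, 30)
diff = sub(x, y)
lo, hi = diff[-1]
print(is_nested(diff), float(lo), float(hi))
```

Note that the lower end point of $x - y$ pairs $a_n$ with $\beta_n$, not with 
$\alpha_n$; pairing like with like would not give intervals containing the 
difference. 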

The arithmetic operations obey the usual laws. In particular, the 
relation $x < y$ is equivalent to $0 < y - x$. We can introduce the 
absolute value of a real number and prove the triangle inequality 
$|x + y| \le |x| + |y|$. The notion of limit of a sequence of real numbers 
defined above in terms of order relations can then be given the equivalent 
formulation: $x = \lim x_n$ if for every real positive $\epsilon$ the relation 
$|x - x_n| < \epsilon$ holds for almost all $n$. 

We now verify the so-called 


AXIOM OF ARCHIMEDES. If x and y are real numbers and x is positive, 
then there exists a natural number m such that mx > y. 


In essence this means a real number cannot be “infinitely small” or 
“infinitely large” compared with another (except if one of them is zero). 
To prove the Axiom of Archimedes (which in our context is really a 
theorem) we observe that for rational numbers it is a consequence of 
the common properties of integers. If now $x \sim \{[a_n, b_n]\}$ and 
$y \sim \{[\alpha_n, \beta_n]\}$ are real numbers and $x$ is positive, then $a_n > 0$ for 
almost all $n$. Since $a_n$ and $\beta_n$ are rational numbers, we can then find 
an $m$ so large that $m a_n > \beta_n$, whence $mx \ge m a_n > \beta_n \ge y$. 
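
The last step of this proof is constructive and can be sketched in a few 
lines. The function name `archimedes_multiple` is an assumption of ours, as 
are the sample end points: $a_n = 1448/1024$ is a lower end point for 
$\sqrt{2}$ and $\beta_n = 1000 \cdot 1774/1024$ an upper end point for 
$1000\sqrt{3}$. 

```python
from fractions import Fraction

def archimedes_multiple(a_n, beta_n):
    """Natural number m with m * a_n > beta_n, for rationals a_n > 0 and beta_n."""
    if a_n <= 0:
        raise ValueError("need a positive lower end point a_n")
    # floor(beta_n / a_n) + 1 exceeds beta_n / a_n; m = 1 suffices if beta_n < 0.
    return max(1, beta_n // a_n + 1)

a_n = Fraction(1448, 1024)            # a_n**2 < 2, so a_n <= sqrt(2) = x
beta_n = 1000 * Fraction(1774, 1024)  # (1774/1024)**2 > 3, so beta_n >= 1000*sqrt(3) = y
m = archimedes_multiple(a_n, beta_n)
print(m, m * a_n > beta_n)
```

Since $x \ge a_n$ and $y \le \beta_n$, the same $m$ satisfies $mx > y$ for the 
real numbers themselves. 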


d. Completeness of the Number Continuum. Compactness 
of Closed Intervals. Convergence Criteria 


Real numbers make possible limit operations with rational numbers, 
but they would be of little value if the corresponding limit operations 
carried out with them necessitated the introduction of some further 
kind of “unreal” numbers which would have to be fitted in between 
the real ones, and so on ad infinitum. Fortunately, the definition of 
real number is so comprehensive that no further extension of the 
number system is possible without discarding one of its essential 
properties (as “order” must be discarded for complex numbers). 


Principle of Continuity 


This completeness of the real number continuum is expressed by 
the basic continuity principle (cf. p. 8): Every nested sequence of 
intervals with real end points contains a real number. To prove this, 
consider closed intervals $[x_n, y_n]$, each interval contained in the preceding 
one, whose lengths $y_n - x_n$ form a null-sequence. We claim there is a 
real $x$ contained in all $[x_n, y_n]$. The sequences $x_n$ and $y_n$ will then have 
$x$ as limit. To prove this we replace the nested sequence $[x_n, y_n]$ by a 
nested sequence of rational intervals $[a_n, b_n]$, containing the $[x_n, y_n]$. 
This rational sequence will then define the desired real number $x$. For 
each $n$ let $a_n$ be the largest rational number of the form $p/2^n$ less than 
$x_n$, and $b_n$ the smallest rational number of the form $q/2^n$ greater than 
$y_n$, where $p$ and $q$ are integers. Clearly, the intervals $[a_n, b_n]$ form a 
nested sequence representing a real number $x$. If $x$ lay outside one of 
the intervals $[x_m, y_m]$, say $x < x_m$, there would exist a rational $r$ with 
$x < r < x_m$ whence for all sufficiently large $n$ we would have 

$$y_n \le b_n < r < x_m \le x_n,$$

which is impossible. Hence all intervals $[x_m, y_m]$ contain the point $x$. 


Weierstrass’ Principle—Compactness 


Several other versions of this principle of continuity are important. 
The first is the Weierstrass principle of existence of limit points or 
accumulation points of bounded sequences. A point $x$ is a limit point of 
a sequence $x_1, x_2, \ldots$ if every open interval containing $x$ also contains 
points $x_n$ for infinitely many $n$. Notice the difference between this 
definition and the definition of limit, where the $x_n$ for almost all $n$ must 
lie in the open interval, that is, for all $n$ with at most a finite number of 
exceptions, or for all sufficiently large $n$. If a sequence has a limit, then 
this limit is also a limit point of the sequence and is in fact the only one. 
There may be no limit point (as in the example of the sequence 1, 2, 3, 
4,...) or a single limit point (as in a convergent sequence) or several 
limit points (for example, the sequence 1, —1, 1, —1,... has the two 
limit points +1 and —1). The Weierstrass principle asserts: Every 
bounded sequence has at least one limit point. 

To prove this we observe that since the sequence $x_1, x_2, \ldots$ is bounded, 
there exists an interval $[y_1, z_1]$ containing all $x_n$. Starting with $[y_1, z_1]$ 
we construct by induction over $n$ a nested sequence of intervals $[y_n, z_n]$ 
each containing points $x_m$ for infinitely many $m$. If $[y_n, z_n]$ contains 
infinitely many $x_m$, we divide $[y_n, z_n]$ into two equal parts by its 
midpoint. At least one of the two resulting closed intervals must again 
contain infinitely many $x_m$ and can be taken as the interval $[y_{n+1}, z_{n+1}]$. 
It is clear that the $[y_n, z_n]$ form a nested sequence representing a real 
number $x$. Every open interval containing $x$ will contain the intervals 
$[y_n, z_n]$ for sufficiently large $n$ and hence must contain infinitely many $x_m$. 

Limit points can also be defined as limits of subsequences of the given 
infinite sequence $x_1, x_2, \ldots$. A subsequence is any infinite sequence 
extracted from the given sequence, or of the form $x_{n_1}, x_{n_2}, x_{n_3}, \ldots$ 
where $n_1 < n_2 < n_3 < \cdots$. Obviously, a point $x$ is a limit point of the 
sequence $x_1, x_2, \ldots$ if it is the limit of some subsequence. Conversely, 
for any limit point $x$ we can, by induction, construct a subsequence 
$x_{n_1}, x_{n_2}, \ldots$ converging to $x$. If $x_{n_1}, \ldots, x_{n_{k-1}}$ are defined already we 
take for $n_k$ one of the infinitely many integers $n$ for which $n > n_{k-1}$ 
and $|x_n - x| < 2^{-k}$. 

We restate the Weierstrass principle in the form: 


THEOREM. Every bounded infinite sequence of real numbers has a 
convergent subsequence. 


A set is called compact if every sequence of its elements contains a 
subsequence converging to an element of the set. Rephrasing our 
theorem we say that closed intervals of real numbers are compact sets. 


Monotone Sequences 


A special consequence of this theorem is that every bounded 
monotone sequence converges. Indeed, let the sequence $x_1, x_2, \ldots$ be 
monotone, say monotonic increasing. If the sequence is also bounded, 
it has a limit point $x$. Arbitrarily close to $x$ there must be points $x_n$ of 
the sequence, none exceeding $x$, since the subsequent terms increase, 
and if $x_n > x$ then $x_m \ge x_n > x$ for $m > n$. It follows that every 
interval containing $x$ contains almost all $x_n$, or $x$ is the limit of the 
sequence. 
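
As a small illustration, not taken from this section, consider the sequence 
$x_n = (1 + 1/n)^n$, which is known to be increasing and bounded by 3. The 
sketch below checks both properties in exact arithmetic for the first fifty 
terms only; the choice of sequence and the cutoff are our assumptions, and 
such a finite check is no substitute for a proof. 

```python
from fractions import Fraction

def x(n):
    """Exact value of (1 + 1/n)**n as a rational number."""
    return (1 + Fraction(1, n)) ** n

terms = [x(n) for n in range(1, 51)]

# Monotonic increasing: each term is smaller than its successor.
increasing = all(s < t for s, t in zip(terms, terms[1:]))
# Bounded: every computed term stays below 3.
bounded = all(t < 3 for t in terms)

print(increasing, bounded, float(terms[-1]))
```

Boundedness and monotonicity are exactly the two properties the theorem 
needs, and neither requires knowing the value of the limit in advance. 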


Cauchy's Convergence Criterion 


The condition that a sequence is bounded and monotone is sufficient 
for convergence. The significance of this statement is that it often 
permits us to prove existence of the limit of a sequence without 
requiring a priori knowledge of the value of the limit; in addition, 
boundedness and monotonicity of a sequence are properties usually 
easy to check in concrete applications. However, not every convergent 
sequence need be monotone (although it has to be bounded) and it is 
important to have a more generally applicable criterion for convergence. 
Such is the intrinsic convergence test of Cauchy which is a necessary 
and sufficient condition for the existence of the limit of a sequence. 


The sequence $x_1, x_2, x_3, \ldots$ converges if and only if for every positive $\epsilon$ 
there exists an $N$ such that $|x_n - x_m| < \epsilon$ for all $n$ and $m$ exceeding $N$. 


In other words, a sequence converges if and only if any two of its elements 
with sufficiently large indices differ from each other by less than any 
prescribed positive $\epsilon$. 

We claim that the condition is necessary for convergence. If 
$x = \lim x_n$ then every $x_n$ with sufficiently large $n$ differs from $x$ by less 
than $\epsilon/2$, and hence by the triangle inequality every two such values $x_n$ 
and $x_m$ will differ from each other by less than $\epsilon$. Conversely, consider 
a sequence for which $|x_n - x_m| < \epsilon$ for any $\epsilon > 0$, for all sufficiently 
large $n$ and $m$. Then there exists a value $N$ such that almost all $x_n$ differ 
from $x_N$ by less than 1. This means that almost all $x_n$ can be enclosed 
in an interval of length 2. We can then find an interval so large that it 
includes also the finite number of $x_n$ which may lie outside the interval 
about $x_N$. Thus the sequence is bounded and hence has a limit point $x$. 
Every open interval containing $x$ will also contain some points $x_m$ with 
arbitrarily large $m$. Since points $x_n$ differ arbitrarily little from 
each other for sufficiently large $n$, it follows that the open interval 
about $x$ must contain almost all $x_n$, and so $x$ is the limit of the sequence. 
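
A hedged numerical illustration of the criterion: for the partial sums 
$x_n = 1 + 1/4 + \cdots + 1/n^2$ (an example of our choosing, not from the 
text) one has $|x_m - x_n| < 1/N$ whenever $m, n > N$, without any reference 
to the value of the limit. The sketch checks this for $N = 100$ on a finite 
window of indices only. 

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(count):
    """Exact partial sums x_1, ..., x_count of the series of 1/k**2."""
    return list(accumulate(Fraction(1, k * k) for k in range(1, count + 1)))

N = 100
xs = partial_sums(150)   # xs[i] holds x_{i+1}
tail = xs[N:]            # x_{N+1}, ..., x_150
eps = Fraction(1, N)

# Every pair of terms with indices exceeding N differs by less than eps.
cauchy_on_window = all(abs(s - t) < eps for s in tail for t in tail)
print(cauchy_on_window)
```

The test is intrinsic in the sense of the text: only differences between 
terms of the sequence enter, never the limit itself. 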


e. Least Upper Bound and Greatest Lower Bound 


It is of great importance that a bounded set of real numbers has 
“best possible” upper and lower bounds. A set $S$ of real numbers $x$ is 
bounded, if all numbers of $S$ can be enclosed in one and the same finite 
interval. There are then upper bounds of $S$, numbers $B$ which are not 
exceeded by any number $x$ of $S$: 

$$x \le B \quad \text{for all } x \text{ in } S.$$

Similarly, there are lower bounds $A$ of $S$: 

$$A \le x \quad \text{for all } x \text{ in } S.$$

Thus for the set of reciprocals of natural numbers $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots$, any 
number $B \ge 1$ is an upper bound, any number $A \le 0$, a lower bound; 
here the number 1, a member of the set, is the least upper bound, and 
the number 0, a limit point of the elements of the set although not a 
member, is the greatest lower bound. The least upper bound of a set 
of real numbers is often called its supremum, the greatest lower bound 
its infimum. In general the supremum and infimum of a set are either 
members of the set or at least limits of sequences of members of the 
set. For, if the least upper bound $b$ of $S$ does not belong to $S$, there 
must be some members of $S$ lying arbitrarily close to $b$, since otherwise 
we could find upper bounds of $S$ smaller than $b$; thus we can select 
successively a sequence of numbers $x_1, x_2, \ldots$ from $S$ which lie closer 
and closer to $b$ and converge to $b$. 

The existence of a least upper bound of a bounded set $S$ follows 
immediately from the convergence of monotone bounded sequences. 
For any $n$ we define $B_n$ as the smallest rational upper bound of $S$ with 
denominator $2^n$. Clearly, for any $x$ in $S$ and any $n$ 

$$x \le B_{n+1} \le B_n \le B_1.$$

Thus the $B_n$ form a monotonically decreasing and bounded sequence 
which must have a limit $b$. It is easy to see that $b$ is an upper bound of 
$S$ and that there exists no smaller upper bound. The existence of the 
greatest lower bound is proved in the same way. 
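
The dyadic bounds $B_n$ can be computed explicitly when there is a test that 
decides whether a given rational number is an upper bound of $S$. The sketch 
below assumes such a test is available and takes, purely as an example, the 
set $S$ of rational $x$ with $x^2 < 2$. The name 
`smallest_dyadic_upper_bound` and the bracketing integers `lo` and `hi` are 
hypothetical; the binary search is merely a convenient way to locate the 
smallest numerator. 

```python
from fractions import Fraction

def smallest_dyadic_upper_bound(is_upper_bound, n, lo, hi):
    """Smallest p/2**n that is an upper bound of S.

    Assumes the integer lo is not an upper bound and the integer hi is one.
    """
    p_lo, p_hi = lo * 2**n, hi * 2**n
    while p_hi - p_lo > 1:
        mid = (p_lo + p_hi) // 2
        if is_upper_bound(Fraction(mid, 2**n)):
            p_hi = mid       # mid/2**n is still an upper bound
        else:
            p_lo = mid       # mid/2**n is exceeded by some element of S
    return Fraction(p_hi, 2**n)

def bounds_sqrt2_set(B):
    """True if B is an upper bound of S = {x rational : x*x < 2}."""
    return B > 0 and B * B >= 2

B = [smallest_dyadic_upper_bound(bounds_sqrt2_set, n, 1, 2) for n in range(1, 31)]
print(B[:4], float(B[-1]))
```

The computed values never increase with $n$, in agreement with the 
inequality $B_{n+1} \le B_n$, and they approach the least upper bound of this 
particular set, $\sqrt{2}$, from above. 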


f. Denumerability of the Rational Numbers 


A surprising discovery concerning the rational numbers was made 
late in the nineteenth century and stimulated the creation by Georg 
Cantor of the Theory of Sets after 1872. Although the rational numbers 
are dense and cannot be ordered by size, they can be arranged 
nevertheless as an infinite sequence $r_1, r_2, \ldots, r_n, \ldots$ in which every 
rational number appears once. In this way the rational numbers can be 
enumerated, or counted off, as a first, second, ..., $n$th, ... rational 
number, where, of course, the order of the numbers in the sequence 
does not correspond at all to their order by magnitude. This result, 
which holds just as well for the rational numbers in any interval, is 
expressed by the statement: The rational numbers are denumerable, or 
they form a denumerable set. 

To prove this result we simply give a prescription for arranging 
the positive rational numbers as a sequence. Every such number can 
be written in the form $p/q$, where $p$ and $q$ are natural numbers. For 
each positive integer $k$ there are exactly $k - 1$ fractions $p/q$ for which 
$p + q = k$. These are arranged in order of increasing $p$. Writing the 
different arrays of numbers for $k = 2, 3, 4, \ldots$ successively, we obtain 
(see Fig. 1.8.1) a sequence which contains all positive rational numbers. 
Omitting fractions in which numerator and denominator have a 
common factor greater than 1, and which thus represent the same rational 
number as a previous fraction, we obtain the sequence 

$$1,\ \tfrac{1}{2},\ 2,\ \tfrac{1}{3},\ 3,\ \tfrac{1}{4},\ \tfrac{2}{3},\ \tfrac{3}{2},\ 4,\ \ldots$$

in which every positive rational number occurs exactly once. A 
similar sequence containing all rational numbers or all rational numbers 
in some particular interval is easily constructed. 
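
The prescription translates directly into a short generator. This is a 
sketch of the enumeration described above; the function name 
`positive_rationals` is our own, and fractions not in lowest terms are 
skipped by a greatest-common-divisor test. 

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield every positive rational exactly once, grouped by k = p + q."""
    k = 2
    while True:
        for p in range(1, k):          # increasing numerator p
            q = k - p
            if gcd(p, q) == 1:         # omit fractions with a common factor
                yield Fraction(p, q)
        k += 1

first = list(islice(positive_rationals(), 11))
print(", ".join(str(r) for r in first))
```

A given fraction $p/q$ in lowest terms appears in the group $k = p + q$, so 
its position in the sequence is reached after finitely many steps. 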

This result is seen in proper perspective only in the light of another 
basic fact: that the set of all real numbers is not denumerable.¹ This 
is an indication that the set of real numbers contains “many more” 
elements than that of the rational numbers, although both sets are 
infinite; thus denumerability is indeed a highly restrictive property of a set. 


Figure 1.8.1 Denumerability of the positive rationals. 


The Theory of Sets plays an important clarifying role in mathe- 
matics, although its use in unrestricted generality has led to paradoxical 
results and controversies. Such paradoxes, however, do not affect the 
substance of constructive mathematics and are absent from the theory 
of sets of real numbers. 


S.2 Theorems on Continuous Functions 


Important properties of continuous functions are established on the 
basis of the completeness property for real numbers. We recall the 
definition of continuity: the function $f(x)$ is continuous at the point $\xi$ 
if for any given positive $\epsilon$ the inequality $|f(x) - f(\xi)| < \epsilon$ holds for 

1 For proof and a brief general discussion of the basic facts of set theory see 
What is Mathematics? by Courant and Robbins, p. 81. 




all $x$ sufficiently close to $\xi$, or, for all $x$ differing from $\xi$ by less than a 
suitable quantity $\delta$, which generally depends on the choice of $\epsilon$ and $\xi$. 
It is understood in this definition that only values of $x$ and $\xi$ for which 
$f$ is defined are considered. 

A more concise definition of continuity in terms of convergence of 
sequences is: $f(x)$ is continuous at the point $\xi$ if 
$\lim_{n \to \infty} f(x_n) = f(\xi)$ for every sequence $x_1, x_2, \ldots$ with limit $\xi$ 
(where again the values $x_n$ and $\xi$ are in the domain of $f$). The 
equivalence of the two definitions was proved in Section 1.8, p. 82. 

We call $f$ continuous in an interval if $f$ is continuous at each point of 
the interval. $f(x)$ is uniformly continuous if for given $\epsilon > 0$ we have 
$|f(x) - f(\xi)| < \epsilon$ whenever $x$ and $\xi$ are sufficiently close regardless of 
their location in the interval; thus $f$ is uniformly continuous if the 
quantity $\delta$ appearing in the definition of continuity can be chosen 
independently of $\xi$: For every $\epsilon > 0$ there exists a $\delta = \delta(\epsilon) > 0$ such 
that $|f(x) - f(\xi)| < \epsilon$ whenever $|x - \xi| < \delta$. For practical purposes 
this means that if we subdivide the interval in which $f$ is defined into a 
sufficiently large number of equal subintervals, then $f$ will vary by less 
than a prescribed amount $\epsilon$ in each subinterval: At any point, $f$ will 
then differ by less than $\epsilon$ from its value at any other point of the same 
subinterval. 


We now prove: Every function continuous in a closed interval [a, b] 
is uniformly continuous in that interval. 


If $f$ were not uniformly continuous in $[a, b]$, there would exist a fixed 
$\epsilon > 0$ and points $x$, $\xi$ in $[a, b]$ arbitrarily close to each other for which 
$|f(x) - f(\xi)| \ge \epsilon$. It would then be possible for every $n$ to choose 
points $x_n$, $\xi_n$ in $[a, b]$ for which $|f(x_n) - f(\xi_n)| \ge \epsilon$ and 
$|x_n - \xi_n| < 1/n$. Since the $x_n$ form a bounded sequence of numbers we 
could find a subsequence converging to a point $\eta$ of the interval (using 
the compactness of closed intervals). The corresponding values $\xi_n$ would 
then also converge to $\eta$; since $f$ is continuous at $\eta$, we would find that 
$f(\eta) = \lim f(x_n) = \lim f(\xi_n)$ for $n$ tending to infinity in the 
subsequence, which is impossible if $|f(x_n) - f(\xi_n)| \ge \epsilon$ for all $n$. 

The intermediate value theorem asserts: If for a function $f(x)$ continuous 
in an interval $a \le x \le b$, $\gamma$ is any value between $f(a)$ and $f(b)$, 
then $f(\xi) = \gamma$ for some suitable $\xi$ between $a$ and $b$. Thus the existence 
of a solution $\xi$ of the equation $f(\xi) = \gamma$ is certain if one exhibits two 
values $a$ and $b$ for which $f(a) < \gamma$ and $f(b) > \gamma$ respectively. This 
immediately implies the existence of a uniquely determined inverse 
function if $f$ is continuous and monotonic, as we have seen (p. 44). 

To prove the intermediate value theorem let $a < b$, $f(a) = \alpha$, 
$f(b) = \beta$, and $\alpha < \gamma < \beta$. Let $S$ be the set of points $x$ of the interval 
$[a, b]$ for which $f(x) < \gamma$. $S$ is bounded and has a least upper bound $\xi$ 
also belonging to the closed interval $[a, b]$. Then $f(x) \ge \gamma$ for 
$\xi < x \le b$. The point $\xi$ either belongs to $S$ or is the limit of a sequence 
of points $x_n$ of $S$. In the first case $f(\xi) < \gamma$; hence $\xi < b$, since 
$f(b) > \gamma$, and there are points $x$ between $\xi$ and $b$, arbitrarily close to $\xi$, 
for which $f(x) \ge \gamma$. This is impossible if $f$ is continuous at $\xi$ and 
$f(\xi) < \gamma$. In the second case, where $f(\xi) \ge \gamma$, we find from 
$f(x_n) < \gamma$ and $\lim x_n = \xi$ that $f(\xi) \le \gamma$; since we saw already that 
$f(\xi) < \gamma$ is impossible, we must have $f(\xi) = \gamma$. 
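
The theorem guarantees a solution of $f(\xi) = \gamma$ but the proof above does 
not compute it. A common way to approximate such a $\xi$ numerically is 
repeated bisection, sketched below in floating-point arithmetic. This is a 
different technique from the least-upper-bound argument of the text; the 
function name `solve_by_bisection`, the tolerance, and the sample function 
$f(x) = x^3 + x$ are our assumptions. 

```python
def solve_by_bisection(f, a, b, gamma, tol=1e-12):
    """Approximate a point xi in [a, b] with f(xi) = gamma.

    Assumes f is continuous on [a, b] and f(a) < gamma < f(b).
    """
    if not (f(a) < gamma < f(b)):
        raise ValueError("need f(a) < gamma < f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < gamma:
            a = mid      # keep f(a) < gamma
        else:
            b = mid      # keep f(b) >= gamma
    return (a + b) / 2

def f(x):
    return x**3 + x

xi = solve_by_bisection(f, 0.0, 1.0, 1.0)
print(xi, f(xi))
```

Each step keeps $f(a) < \gamma \le f(b)$ on an interval of half the previous 
length, so the intervals form a nested sequence in the sense of Section S.1. 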

A third basic property of a continuous function $f(x)$ in a closed 
interval $[a, b]$ is the existence of a largest value (maximum), meaning 
that there exists a point $\xi$ in the interval $[a, b]$ such that $f(x) \le f(\xi)$ for 
all $x$ in the interval. Similarly, $f$ will assume its least value (minimum) 
at some point $\eta$ of the interval: $f(x) \ge f(\eta)$ for all $x$ in the interval. It 
is essential to have the interval closed: for example, the functions 
$f(x) = x$ or $f(x) = 1/x$ are continuous, but they do not have a largest 
value in the open interval $0 < x < 1$; the maximum may just occur 
at one of the end points or not exist at all if $f$ is not continuous at the 
end points. 

To prove this principle we observe that a function $f$ continuous in 
$[a, b]$ is necessarily bounded: that is, the values $f(x)$ forming the “range” 
$S$ of $f$ lie in some finite interval. Indeed by the uniform continuity of $f$ 
we can find a finite number of points $x_1, x_2, \ldots, x_n$ in the interval such 
that $f(x)$ at any $x$ of the interval differs by less than one from one of the 
numbers $f(x_1), f(x_2), \ldots, f(x_n)$ which can all be fitted into a finite 
interval. Since then the set $S$ of values $f(x)$ is bounded, it has a least 
upper bound $M$. This $M$ is the smallest number such that $f(x) \le M$ 
for all $x$ in $[a, b]$. Either $M$ belongs to $S$ or is the limit of a sequence of 
points of $S$. In the first case, there exists a $\xi$ in $[a, b]$ with $f(\xi) = M$. 
In the second case, there exists a sequence of points $x_n$ in $[a, b]$ with 
$\lim f(x_n) = M$; thus we can find a subsequence of the $x_n$ which converges 
to a point $\xi$ of $[a, b]$ and again $f(\xi) = M$ by continuity of $f$ at $\xi$. 
Clearly, $f(\xi)$ is the maximum of $f$. 


S.3 Polar Coordinates 


In Chapter 1 we have represented functions geometrically by curves. 
Analytical geometry follows the reverse procedure, beginning with a 
curve and representing it by a function, for example, by a function
expressing one of the coordinates of a point of the curve in terms of
the other. This point of view naturally leads us to consider, in addition
to the rectangular coordinates to which we restricted ourselves, other
systems of coordinates possibly better suited for the representation of
curves given geometrically. The most important example is that of
polar coordinates $r$, $\theta$ connected with the rectangular coordinates x, y
of a point P by the equations

$$x = r\cos\theta, \quad y = r\sin\theta, \quad r^2 = x^2 + y^2, \quad \tan\theta = \frac{y}{x},$$

whose geometrical interpretation is made clear in Fig. 1.S.2.¹


Figure 1.S.2 Polar coordinates.


We consider, for example, the lemniscate. This is geometrically
defined as the locus of all points P for which the product of the distances
$r_1$ and $r_2$ from the fixed points $F_1$ and $F_2$ with the rectangular coordinates
x = a, y = 0 and x = -a, y = 0 respectively, has the constant value
$a^2$ (cf. Fig. 1.S.3). Since

$$r_1^2 = (x - a)^2 + y^2, \quad r_2^2 = (x + a)^2 + y^2,$$

a simple calculation gives us the equation of the lemniscate in the form

$$(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0.$$

Introducing polar coordinates, we obtain

$$r^4 - 2a^2r^2(\cos^2\theta - \sin^2\theta) = 0;$$

1 The polar coordinates are not completely determined by the point P. In addition
to $\theta$, any of the angles $\theta \pm 2\pi$, $\theta \pm 4\pi$, ... can be considered a polar angle of P.

Figure 1.S.3 Lemniscate.


dividing by $r^2$ and using a simple trigonometrical formula this becomes

$$r^2 = 2a^2\cos 2\theta.$$


Thus the equation of the lemniscate is simpler in polar coordinates 
than in rectangular. 
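As a numerical cross-check of the two descriptions, the sketch below takes points from the polar equation $r^2 = 2a^2\cos 2\theta$ and verifies that the product of their distances from the points (a, 0) and (-a, 0) equals $a^2$. The function names and the sample values of a and $\theta$ are assumptions chosen for illustration.

```python
import math


def lemniscate_point(a, theta):
    """Point (x, y) on r^2 = 2 a^2 cos(2 theta); needs cos(2 theta) >= 0."""
    c = math.cos(2 * theta)
    if c < 0:
        raise ValueError("no real r for this angle")
    r = a * math.sqrt(2 * c)
    return r * math.cos(theta), r * math.sin(theta)


def distance_product(a, x, y):
    """Product r1 * r2 of the distances from (x, y) to (a, 0) and (-a, 0)."""
    r1 = math.hypot(x - a, y)
    r2 = math.hypot(x + a, y)
    return r1 * r2


a = 1.5
products = [distance_product(a, *lemniscate_point(a, t))
            for t in (0.0, 0.2, 0.5, 0.7)]
print(products, a * a)  # each product should agree with a^2 up to rounding
```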


S.4 Remarks on Complex Numbers 


Our studies will be based chiefly on the continuum of real numbers. 
Nevertheless, with a view to discussions in Chapters 7, 8, and 9, 
we remind the reader that the problems of algebra have led to a still 
wider extension of the concept of number, the complex numbers. The 
advance from the natural numbers to the real numbers arose from the 
desire to eliminate exceptional phenomena and to make certain 
operations, such as subtraction, division, and correspondence between 
points and numbers, always possible. Similarly, we are compelled by 
the requirement that every quadratic equation and in fact every algebraic 
equation shall have a solution, to introduce the complex numbers. If, 
for example, we wish the equation 


$$x^2 + 1 = 0$$


to have roots, we are obliged to introduce new symbols i and -i as
the roots. (As is shown in the theory of functions of a complex variable,
this is sufficient to insure that every algebraic equation shall have a
solution.¹)


1 An algebraic equation is of the form P(x) = 0, where P is a polynomial with 
complex coefficients.

If a and b are ordinary real numbers, the complex number c = a + ib
denotes a pair of numbers (a, b) with which calculations are performed
according to the following general rule: We add, multiply, and divide
complex numbers (among which the real numbers are included as the
special case b = 0), treating the symbol i as an undetermined quantity,
and simplify all expressions using the equation $i^2 = -1$ to remove all
powers of i higher than the first, leaving only an expression of the form
a + ib.

We assume that the reader already has a certain degree of familiarity 
with the complex numbers. We nevertheless emphasize a particularly 


Figure 1.S.4 Geometric representation of a complex number x + yi and of its
conjugate. 


important relationship which we shall explain in connection with the 
geometrical or trigonometrical representation of the complex numbers. 
If c = x + iy is such a number, we represent it in a rectangular coordinate
system by the point P with coordinates x and y. By means of
the equations $x = r\cos\theta$, $y = r\sin\theta$, we introduce the polar coordinates
r and $\theta$ (cf. p. 101) instead of the rectangular coordinates x and y. Then
$r = \sqrt{x^2 + y^2}$ is the distance of the point P from the origin, and $\theta$ the
angle between the positive x-axis and the segment OP. The complex
number c is represented in the form

$$c = r(\cos\theta + i\sin\theta).$$


The angle $\theta$ is called the amplitude of the complex number c, the
quantity r its absolute value or modulus, for which we also write |c|.
To the “conjugate” complex number $\bar{c} = x - iy$ there obviously
corresponds the same absolute value, but the amplitude $-\theta$ (Fig. 1.S.4).
Clearly,

$$r^2 = |c|^2 = c\bar{c} = x^2 + y^2.$$

If we use this trigonometrical representation, the multiplication of
complex numbers takes a particularly simple form, for then

$$c \cdot c' = r(\cos\theta + i\sin\theta) \cdot r'(\cos\theta' + i\sin\theta')$$
$$= rr'[(\cos\theta\cos\theta' - \sin\theta\sin\theta') + i(\cos\theta\sin\theta' + \sin\theta\cos\theta')].$$

If we use the addition theorems for the trigonometric functions, this
becomes

$$cc' = rr'(\cos(\theta + \theta') + i\sin(\theta + \theta')).$$


Figure 1.S.5 The nth roots of unity (for n = 16).


We therefore multiply complex numbers by multiplying their absolute 
values and adding their amplitudes. The remarkable formula 


$$(\cos\theta + i\sin\theta)(\cos\theta' + i\sin\theta') = \cos(\theta + \theta') + i\sin(\theta + \theta')$$

is usually called De Moivre’s theorem. It leads us to the relation

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta,$$


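which, for example, at once enables us to solve the equation $x^n = 1$ for
positive integers n; the roots (the so-called roots of unity) are

$$\epsilon_1 = \epsilon = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}, \quad \epsilon_2 = \epsilon^2 = \cos\frac{4\pi}{n} + i\sin\frac{4\pi}{n}, \quad \ldots,$$
$$\epsilon_{n-1} = \epsilon^{n-1} = \cos\frac{2(n-1)\pi}{n} + i\sin\frac{2(n-1)\pi}{n}, \quad \epsilon_n = \epsilon^n = 1$$

(Fig. 1.S.5).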
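The relation for powers and the formula for the roots of unity can be checked numerically. The following is a minimal sketch using Python's standard `cmath` module; the helper name `roots_of_unity` and the sample values ($\theta = 0.3$, exponent 7, n = 16) are assumptions made for the illustration.

```python
import cmath
import math


def roots_of_unity(n):
    """Return eps_1, ..., eps_n with eps_k = cos(2*pi*k/n) + i*sin(2*pi*k/n)."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(1, n + 1)]


# De Moivre's relation: (cos t + i sin t)^m = cos(m t) + i sin(m t).
theta, m = 0.3, 7
lhs = complex(math.cos(theta), math.sin(theta)) ** m
rhs = complex(math.cos(m * theta), math.sin(m * theta))
print(abs(lhs - rhs))  # rounding error only

# Each of the 16 roots satisfies x^16 = 1 up to rounding.
roots = roots_of_unity(16)
print(max(abs(e ** 16 - 1) for e in roots))
```

`cmath.rect(r, phi)` builds the number with modulus r and amplitude phi, which mirrors the trigonometrical representation used in the text.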
SECTION 1.1c, page 9

1. Let [x] denote the integer part of x; that is, [x] is the integer satisfying

$$x - 1 < [x] \le x.$$

Set $c_0 = [x]$, and $c_n = [10^n(x - c_0) - 10^{n-1}c_1 - 10^{n-2}c_2 - \cdots - 10c_{n-1}]$
for n = 1, 2, 3, .... Verify that the decimal representation of x is

$$x = c_0 + 0.c_1c_2c_3\cdots$$

and that this construction excludes the possibility of an infinite string of 9’s.
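The construction in Problem 1 can be tried out directly. The sketch below is one possible transcription, assuming exact rational input via Python's `fractions.Fraction` so that the integer parts are computed without rounding; the function name `decimal_digits` is hypothetical.

```python
from fractions import Fraction
import math


def decimal_digits(x, n_digits):
    """Return (c_0, [c_1, ..., c_n]) following the recipe
    c_0 = [x],  c_n = [10^n (x - c_0) - 10^(n-1) c_1 - ... - 10 c_(n-1)]."""
    x = Fraction(x)
    c0 = math.floor(x)
    digits = []
    for n in range(1, n_digits + 1):
        # Contribution of the digits already found: sum of 10^(n-j) * c_j.
        tail = sum(10 ** (n - j) * c for j, c in enumerate(digits, start=1))
        digits.append(math.floor(10 ** n * (x - c0) - tail))
    return c0, digits


print(decimal_digits(Fraction(1, 8), 4))   # 1/8 = 0.1250
print(decimal_digits(Fraction(-7, 3), 3))  # -7/3 = -3 + 0.666...
print(decimal_digits(1, 3))                # 1 comes out as 1.000, not 0.999...
```

Note that for negative x the integer part is rounded downward, so the fractional digits always describe a number between 0 and 1.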


2. Define inequality x > y for two real numbers in terms of their decimal 
representations (see Supplement, p. 92). 


*3. Prove if p and q are integers, q > 0, that the expansion of p/q as a 
decimal either terminates (all the digits following the last place are zeros) 
or is periodic; that is, from a certain point on the decimal expansion consists 
of the sequential repetition of a given string of digits. For example, 1/4 = 0.25
is terminating, 1/11 = 0.090909 ··· is periodic. The length of the repeated
string is called the period of the decimal; for 1/11 the period is 2. In general,
how large may the period of p/q be? 


SECTION 1.1e, page 12


1. Using signs of inequality alone (not using signs of absolute value) 
specify the values of x which satisfy the following relations. Discuss all cases. 

(a) $|x - a| < |x - b|$.

(b) |x — a| <a — b. 

(c) $|x^2 - a| < b$.


2. An interval (see definitions in text) may be defined as any connected 
part of the real continuum. A subset S of the real continuum is said to be 
connected if with every pair of points a, b in S, the set S contains the entire 
closed interval [a,b]. Aside from the open and closed intervals already 
mentioned, there are the “half-open” intervals $a \le x < b$ and $a < x \le b$
(sometimes denoted by [a, b) and (a, b], respectively) and the unbounded
intervals that may be either the whole real line or a ray, that is, a “half-line”
$x \le a$, $x < a$, $x > a$, $x \ge a$ (sometimes denoted by $(-\infty, \infty)$ and $(-\infty, a]$,
$(-\infty, a)$, $(a, \infty)$, $[a, \infty)$, respectively) (see also footnote, p. 22).

*(a) Prove that the cases of intervals specified above exhaust all possibilities 
for connected subsets of the number axis. 

(b) Determine the intervals in which the following inequalities are satisfied. 

(i) x? —3x +2 <0. 
(ii) $(x - a)(x - b)(x - c) > 0$, for a < b < c.
(iii) |1 — x| — x > 0. 


(iv) $\dfrac{x - a}{x + a} \ge 0$.

(v) $\left|x + \dfrac{1}{x}\right| \ge 6$.


(vi) $[x] < x/2$. See Problem 1 of this page.
(vii) $\sin x > \sqrt{2}/2$.
(c) Prove if a < x < b, then $|x| < |a| + |b|$.

3. Derive the inequalities

(a) $x + \dfrac{1}{x} \ge 2$, for x > 0,

(b) $x + \dfrac{1}{x} \le -2$, for x < 0,

(c) $\left|x + \dfrac{1}{x}\right| \ge 2$, for $x \ne 0$.


4. The harmonic mean $\xi$ of two positive numbers a, b is defined by

$$\frac{1}{\xi} = \frac{1}{2}\left(\frac{1}{a} + \frac{1}{b}\right).$$

Prove that the harmonic mean does not exceed the geometric mean; that is,
that $\xi \le \sqrt{ab}$. When are the two means equal?

5. Derive the following inequalities: 

(a) $x^2 + xy + y^2 \ge 0$,

*(b) $x^{2n} + x^{2n-1}y + x^{2n-2}y^2 + \cdots + y^{2n} \ge 0$,

*(c) $x^4 - 3x^3 + 4x^2 - 3x + 1 \ge 0$.
When does equality hold? 

*6. What is the geometrical interpretation of Cauchy’s inequality for 
n= 2,3? 

7. Show that the equality sign holds in Cauchy’s inequality if and only
if the $a_\nu$ are proportional to the $b_\nu$: that is, $ca_\nu + db_\nu = 0$ for all $\nu$ where c
and d do not depend on $\nu$ and are not both zero.


8. (a) $|x - a_1| + |x - a_2| + |x - a_3| \ge a_3 - a_1$, for $a_1 < a_2 < a_3$.
For what value of x does equality hold?
*(b) Find the largest value of y for which for all x

$$|x - a_1| + |x - a_2| + \cdots + |x - a_n| \ge y,$$

where $a_1 < a_2 < \cdots < a_n$. Under what conditions does equality hold?


9. Show that the following inequalities hold for positive a, b, c. 
(a) $a^2 + b^2 + c^2 \ge ab + bc + ca$.

(b) $(a + b)(b + c)(c + a) \ge 8abc$.

(c) $a^2b^2 + b^2c^2 + c^2a^2 \ge abc(a + b + c)$.


10. Assume that the numbers $x_1, x_2, x_3$ and $a_{ik}$ (i, k = 1, 2, 3) are all
positive, and in addition, $a_{ik} \le M$ and $x_1^2 + x_2^2 + x_3^2 \le 1$. Prove that

$$a_{11}x_1^2 + a_{12}x_1x_2 + \cdots + a_{33}x_3^2 \le 3M.$$


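*11. Prove the following inequality and give its geometrical interpretation
for $n \le 3$,

$$\sqrt{(a_1 - b_1)^2 + \cdots + (a_n - b_n)^2} \le \sqrt{a_1^2 + \cdots + a_n^2} + \sqrt{b_1^2 + \cdots + b_n^2}.$$

12. Prove, and interpret geometrically for $n \le 3$,

$$\sqrt{(a_1 + b_1 + \cdots + z_1)^2 + \cdots + (a_n + b_n + \cdots + z_n)^2}$$
$$\le \sqrt{a_1^2 + \cdots + a_n^2} + \sqrt{b_1^2 + \cdots + b_n^2} + \cdots + \sqrt{z_1^2 + \cdots + z_n^2}.$$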




13. Show that the geometric mean of n positive numbers is not greater
than the arithmetic mean; that is, if $a_i > 0$ (i = 1, ..., n), then

$$\sqrt[n]{a_1a_2\cdots a_n} \le \frac{1}{n}(a_1 + a_2 + \cdots + a_n).$$

(Hint: Suppose $a_1 \le a_2 \le \cdots \le a_n$. For the first step replace $a_n$ by the
geometric mean and adjust $a_1$ so that the geometric mean is left unchanged.)

SECTION 1.2d, page 31


1. If f(x) is continuous at x = a and f(a) > 0, show that the domain of 
f contains an open interval about a where f(x) > 0. 
2. In the definition of continuity show that the centered intervals

$$|f(x) - f(x_0)| < \epsilon \quad \text{and} \quad |x - x_0| < \delta$$

may be replaced by an arbitrary open interval containing $f(x_0)$ and a sufficiently
small open interval containing $x_0$, as indicated on p. 33.


3. Let f(x) be continuous for $0 \le x \le 1$. Suppose further that f(x)
assumes rational values only and that f(x) = 1/2 when x = 1/2. Prove that
f(x) = 1/2 everywhere.


4. (a) Let f(x) be defined for all values of x in the following manner:

$$f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational.} \end{cases}$$

Prove that f(x) is everywhere discontinuous.
(b) On the other hand, consider

$$g(x) = \begin{cases} 0, & x \text{ irrational} \\ \dfrac{1}{q}, & x = \dfrac{p}{q} \text{ rational in lowest terms.} \end{cases}$$

(The rational number p/q is said to be in lowest terms if the integers p and
q have no common factor larger than 1, and q > 0. Thus g(16/29) = 1/29.)
Prove that g(x) is continuous for all irrational values and discontinuous for
all rational values.


*5. If f(x) satisfies the functional equation 


$$f(x + y) = f(x) + f(y)$$


for all values of x and y, find the values of f(x) for rational values of x and 
prove if f(x) is continuous that f(x) = cx where c is a constant. 


6. (a) If $f(x) = x^n$, find a $\delta$ which may depend on $\xi$ such that

$$|f(x) - f(\xi)| < \epsilon \quad \text{whenever} \quad |x - \xi| < \delta.$$

*(b) Do the same if f(x) is any polynomial

$$f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0,$$

where $a_n \ne 0$.

SECTION 1.2e, page 44


1. Prove that if f(x) is monotonic on [a, b] and satisfies the intermediate 
value property, then f(x) is continuous. Can you draw the same conclusion 
if f is not monotonic? 


2. (a) Show that $x^n$ is monotonic for x > 0. As a consequence, show for
a > 0 that $x^n = a$ has a unique positive solution $\sqrt[n]{a}$.
(b) Let f(x) be a polynomial

$$f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 \quad (a_n \ne 0).$$

Show (i) if n is odd, then f(x) has at least one real root, (ii) if $a_n$ and $a_0$ have
opposite signs, then f(x) has at least one positive root, and, in addition, if
n is even, $n \ne 0$, then f(x) has a negative root as well.


*3. (a) Prove that there exists a line in each direction which bisects any 
given triangle, that is, divides the triangle into two parts of equal area. 
(b) For any pair of triangles prove that there exists a line which bisects
them simultaneously. 


SECTION 1.3b, page 49 


1. (a) Prove that $\sqrt{x}$ is not a rational function. (Hint: Examine the
possibility of representing $\sqrt{x}$ as a rational function for $x = y^2$. Use the
fact that a nonzero polynomial can have at most finitely many roots.)


(b) Prove $\sqrt[n]{x}$ is not a rational function.

SECTION 1.3c, page 49


1. (a) Show that a straight line may intersect the graph of a polynomial 
higher than first degree in at most finitely many points. 

(b) Obtain the same result for general rational functions. 

(c) Verify that the trigonometric functions are not rational. 


SECTION 1.5, page 57 


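1. Prove the following properties of the binomial coefficients.

(a) $1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n-1} + \binom{n}{n} = 2^n$.

(b) $1 - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^n\binom{n}{n} = 0$.

(c) $\binom{n}{1} + 2\binom{n}{2} + 3\binom{n}{3} + \cdots + n\binom{n}{n} = n(2^{n-1})$. (Hint: Represent
the binomial coefficients in terms of factorials.)

(d) $1\cdot 2\binom{n}{2} + 2\cdot 3\binom{n}{3} + \cdots + (n-1)n\binom{n}{n} = n(n-1)2^{n-2}$.

(e) $1 + \dfrac{1}{2}\binom{n}{1} + \dfrac{1}{3}\binom{n}{2} + \cdots + \dfrac{1}{n+1}\binom{n}{n} = \dfrac{2^{n+1} - 1}{n+1}$.

*(f) $\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$. (Hint: Consider the coefficient
of $x^n$ in $(1 + x)^{2n}$.)

*(g) $S_n = \binom{n}{0} - \dfrac{1}{3}\binom{n}{1} + \dfrac{1}{5}\binom{n}{2} - \dfrac{1}{7}\binom{n}{3} + \cdots + \dfrac{(-1)^n}{2n+1}\binom{n}{n} = \dfrac{4^n (n!)^2}{(2n+1)!}$.
(Hint: Prove $S_n = \dfrac{2n}{2n+1}S_{n-1}$.)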
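Before attempting proofs, the identities of Problem 1 can be spot-checked for small n. The sketch below assumes the readings $2^n$, $0$, $n2^{n-1}$, $n(n-1)2^{n-2}$, $(2^{n+1}-1)/(n+1)$, $\binom{2n}{n}$ and $4^n(n!)^2/(2n+1)!$ for the right-hand sides of (a) through (g); the function name `check_identities` is hypothetical, and a finite check is of course no substitute for the requested proofs.

```python
from fractions import Fraction
from math import comb, factorial


def check_identities(n):
    """Check the binomial identities (a)-(g) for one value n >= 2.

    Exact integer and rational arithmetic is used throughout.
    """
    C = [comb(n, k) for k in range(n + 1)]
    ks = range(n + 1)
    return {
        "a": sum(C) == 2 ** n,
        "b": sum((-1) ** k * C[k] for k in ks) == 0,
        "c": sum(k * C[k] for k in ks) == n * 2 ** (n - 1),
        "d": sum((k - 1) * k * C[k] for k in ks) == n * (n - 1) * 2 ** (n - 2),
        "e": sum(Fraction(C[k], k + 1) for k in ks)
             == Fraction(2 ** (n + 1) - 1, n + 1),
        "f": sum(c * c for c in C) == comb(2 * n, n),
        "g": sum(Fraction((-1) ** k * C[k], 2 * k + 1) for k in ks)
             == Fraction(4 ** n * factorial(n) ** 2, factorial(2 * n + 1)),
    }


results = {n: check_identities(n) for n in range(2, 12)}
print(all(all(r.values()) for r in results.values()))
```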


2. Prove $(1 + x)^n \ge 1 + nx$, for x > -1.

3. Prove by induction that $1 + 2 + \cdots + n = \tfrac{1}{2}n(n + 1)$.

*4. Prove by induction the following:

(a) $1 + 2q + 3q^2 + \cdots + nq^{n-1} = \dfrac{1 - (n + 1)q^n + nq^{n+1}}{(1 - q)^2}$.

(b) $(1 + q)(1 + q^2)(1 + q^4)\cdots(1 + q^{2^n}) = \dfrac{1 - q^{2^{n+1}}}{1 - q}$.


5. Prove for all natural numbers n greater than 1 that n is either a prime
or can be expressed as a product of primes. (Hint: Let $A_{n-1}$ be the assertion
for all integers k with k < n that k is either prime or a product of primes.)


*6. Consider the sequence of fractions

$$\frac{1}{1}, \frac{3}{2}, \frac{7}{5}, \ldots, \frac{p_n}{q_n}, \ldots,$$

where $p_{n+1} = p_n + 2q_n$ and $q_{n+1} = p_n + q_n$.

(a) Prove for all n that $p_n/q_n$ is in lowest terms.

(b) Show that the absolute difference between $p_n/q_n$ and $\sqrt{2}$ can be made
arbitrarily small. Prove also that the error of approximation to $\sqrt{2}$ alternates
in sign.

7. Let a, b, $a_n$, and $b_n$ be integers such that

$$(a + b\sqrt{2})^n = a_n + b_n\sqrt{2},$$

where a is the integer closest to $b\sqrt{2}$. Prove that $a_n$ is the integer closest to
$b_n\sqrt{2}$.

*8. Let $a_n$ and $b_n$ be defined by

$$a_1 = 3, \quad a_{n+1} = 3^{a_n}, \quad \text{and} \quad b_1 = 9, \quad b_{n+1} = 9^{b_n}.$$

For each value of n, determine the minimum value m such that $a_m > b_n$.

9. If n is a natural number, show that

$$\frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n}{2^n\sqrt{5}}$$

is a natural number.

10. Determine the maximum number of pieces into which a plane may
be cut by n straight lines. Show that the maximum occurs when no two of 
the lines are parallel and no three meet in a common point, and determine 
the number of pieces when concurrences and parallelisms are permitted. 


11. Prove for each natural number n that there exists a natural number k
such that

$$(\sqrt{2} - 1)^n = \sqrt{k} - \sqrt{k - 1}.$$

12. Prove Cauchy’s inequality inductively.

SECTION 1.6, page 60

1. Prove that $\lim_{n\to\infty} (\sqrt{n + 1} - \sqrt{n})\sqrt{n + \tfrac{1}{2}} = \tfrac{1}{2}$.

2. Prove that $\lim_{n\to\infty} (\sqrt[3]{n + 1} - \sqrt[3]{n}) = 0$.

3. Let $a_n = 10^n/n!$. (a) To what limit does $a_n$ converge? (b) Is the sequence
monotonic? (c) Is it monotonic from a certain n onward? (d) Give an
estimate of the difference between $a_n$ and the limit. (e) From what value of
n onward is this difference less than 1/100?


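4. Prove that $\lim_{n\to\infty} \dfrac{n!}{n^n} = 0$.

5. (a) Prove that $\lim_{n\to\infty}\left(\dfrac{1}{n^2} + \dfrac{2}{n^2} + \cdots + \dfrac{n}{n^2}\right) = \dfrac{1}{2}$.

(b) Prove that $\lim_{n\to\infty}\left(\dfrac{1}{n^2} + \dfrac{1}{(n+1)^2} + \cdots + \dfrac{1}{(2n)^2}\right) = 0$. (Hint: Compare
the sum with its largest term.)

(c) Prove that $\lim_{n\to\infty}\left(\dfrac{1}{\sqrt{n}} + \dfrac{1}{\sqrt{n+1}} + \cdots + \dfrac{1}{\sqrt{2n}}\right) = \infty$.

*(d) Prove that $\lim_{n\to\infty}\left(\dfrac{1}{\sqrt{n^2+1}} + \dfrac{1}{\sqrt{n^2+2}} + \cdots + \dfrac{1}{\sqrt{n^2+n}}\right) = 1$.

6. Prove that every periodic decimal represents a rational number.
(Compare Section 1.1c, Problem 3.)

7. Prove that $\lim_{n\to\infty} \dfrac{n^{100}}{1.01^n}$ exists and determine its value.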
8. Prove that if a and b < a are positive, the sequence $\sqrt[n]{a^n + b^n}$ converges
to a. Similarly, for any k fixed positive numbers $a_1, a_2, \ldots, a_k$ prove that
$\sqrt[n]{a_1^n + a_2^n + \cdots + a_k^n}$ converges and find its limit.

9. Prove that the sequence $\sqrt{2}, \sqrt{2\sqrt{2}}, \sqrt{2\sqrt{2\sqrt{2}}}, \ldots$ converges. Find
its limit.

10. If $\nu(n)$ is the number of prime factors of n, prove that

$$\lim_{n\to\infty} \frac{\nu(n)}{n} = 0.$$

11. Prove that if $\lim_{n\to\infty} a_n = \xi$, then $\lim_{n\to\infty} \sigma_n = \xi$, where $\sigma_n$ is the arithmetic
mean $(a_1 + a_2 + \cdots + a_n)/n$.

12. Find

(a) $\lim_{n\to\infty}\left(\dfrac{1}{1\cdot 2} + \dfrac{1}{2\cdot 3} + \cdots + \dfrac{1}{n(n+1)}\right)$.

(Hint: $\dfrac{1}{k(k+1)} = \dfrac{1}{k} - \dfrac{1}{k+1}$.)
| 1 l À 
b) | eee ——————————— 
O lim (35 +3454 t oa 


13. If $a_0 + a_1 + \cdots + a_p = 0$, prove that

$$\lim_{n\to\infty}\left(a_0\sqrt{n} + a_1\sqrt{n+1} + \cdots + a_p\sqrt{n+p}\right) = 0.$$

(Hint: Take $\sqrt{n}$ out as a factor.)
14. Prove that lim "+? V (n? + n) = 1. 
n— @ 


*15. Let $a_n$ be a given sequence such that the sequence $b_n = pa_n + qa_{n+1}$,
where |p| < q, is convergent. Prove that $a_n$ converges. If $|p| \ge q > 0$,
show that $a_n$ need not converge.


16. Prove the relation

$$\lim_{n\to\infty} \frac{1}{n^{k+1}}\sum_{i=1}^{n} i^k = \frac{1}{k+1}$$

for any nonnegative integer k. (Hint: Use induction with respect to k and
use the relation

$$\sum_{i=1}^{n}\left[i^{k+1} - (i-1)^{k+1}\right] = n^{k+1},$$

expanding $(i-1)^{k+1}$ in powers of i.)

SECTION 1.7, page 70


*1. Let $a_1$ and $b_1$ be any two positive numbers, and let $a_1 < b_1$. Let $a_2$
and $b_2$ be defined by the equations

$$a_2 = \sqrt{a_1b_1}, \quad b_2 = \frac{a_1 + b_1}{2}.$$

Similarly, let

$$a_3 = \sqrt{a_2b_2}, \quad b_3 = \frac{a_2 + b_2}{2},$$

and, in general,

$$a_n = \sqrt{a_{n-1}b_{n-1}}, \quad b_n = \frac{a_{n-1} + b_{n-1}}{2}.$$

Prove (a) that the sequence $a_1, a_2, \ldots$ converges, (b) that the sequence
$b_1, b_2, \ldots$ converges, and (c) that the two sequences have the same limit.
(This limit is called the arithmetic-geometric mean of $a_1$ and $b_1$.)
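The iteration of Problem 1 converges very quickly, which a short experiment makes visible. The following sketch assumes floating-point arithmetic and a relative stopping tolerance; the function name `agm`, the tolerance, and the starting values 1 and 2 are choices made only for this illustration.

```python
import math


def agm(a, b, tol=1e-15, max_iter=100):
    """Iterate a -> sqrt(a*b), b -> (a+b)/2 starting from 0 < a <= b.

    Returns the final pair (a_n, b_n); both approximate the common limit.
    """
    if not (0 < a <= b):
        raise ValueError("need 0 < a <= b")
    for _ in range(max_iter):
        if b - a <= tol * b:
            break
        # Simultaneous update: both new values use the old pair.
        a, b = math.sqrt(a * b), (a + b) / 2
    return a, b


lower, upper = agm(1.0, 2.0)
print(lower, upper)  # the two values agree to within rounding
```

In exact arithmetic the geometric means increase and the arithmetic means decrease, so the common limit lies between the two returned values; with floating-point rounding the last digits may not respect this ordering.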

*2. Prove that the limit of the sequence

$$\sqrt{2}, \quad \sqrt{2 + \sqrt{2}}, \quad \sqrt{2 + \sqrt{2 + \sqrt{2}}}, \quad \ldots$$

(a) exists and (b) is equal to 2.


*3. Prove that the limit of the sequence

$$a_n = \frac{1}{n} + \frac{1}{n+1} + \cdots + \frac{1}{2n}$$

exists. Show that the limit is less than 1 but not less than 1/2.

4. Prove that the limit of the sequence

$$b_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}$$

exists and is equal to the limit of the previous example.


5. Obtain the following bounds for the limit L in the two previous
examples: 37/60 < L < 57/60.


*6. Let $a_1$, $b_1$ be any two positive numbers, and let $a_1 < b_1$. Let

$$a_2 = \frac{2a_1b_1}{a_1 + b_1}, \quad b_2 = \sqrt{a_1b_1},$$

and in general

$$a_n = \frac{2a_{n-1}b_{n-1}}{a_{n-1} + b_{n-1}}, \quad b_n = \sqrt{a_{n-1}b_{n-1}}.$$

Prove that the sequences $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$ converge and have the
same limit.

*7. Show that $\dfrac{1}{e} = 1 - 1 + \dfrac{1}{2!} - \dfrac{1}{3!} + \cdots + \dfrac{(-1)^n}{n!} + \cdots$. (Hint:
Consider the product of the nth partial sums of the expansions for e and 1/e.)


8. (a) Without reference to the binomial theorem show that $a_n = (1 + 1/n)^n$
is monotone increasing and $b_n = (1 + 1/n)^{n+1}$ is monotone decreasing.
(Hint: Consider $a_{n+1}/a_n$ and $b_n/b_{n+1}$. Use the result of Section 1.5, Problem
2.)

(b) Which is the larger number, $(1{,}000{,}000)^{1{,}000{,}000}$ or $(1{,}000{,}001)^{999{,}999}$?


9. (a) From the results of Problem 8a show that

$$\left(\frac{n}{e}\right)^n < n! < e(n + 1)\left(\frac{n}{e}\right)^n.$$

(b) For n > 6 derive the sharper inequality

$$n! < n\left(\frac{n}{e}\right)^n.$$


*10. If $a_n > 0$, and $\lim_{n\to\infty} \dfrac{a_{n+1}}{a_n} = L$, then $\lim_{n\to\infty} \sqrt[n]{a_n} = L$.

11. Use Problem 10 to evaluate the limits of the following sequences:

(a) $\sqrt[n]{n}$, (b) $\sqrt[n]{n^5 + n^4}$, (c) $\sqrt[n]{\dfrac{n!}{n^n}}$.

12. Use Problem 11c to show

$$n! = n^n e^{-n} a_n,$$

where $a_n$ is a number whose nth root tends to 1. (See Appendix, Chapter 7.)


13. (a) Evaluate

$$\frac{1}{1\cdot 3} + \frac{1}{2\cdot 4} + \cdots + \frac{1}{n(n + 2)}.$$

(Hint: Compare Section 1.6, Problem 12a.)

(b) From the result above, prove that $\sum_{k=1}^{\infty} \dfrac{1}{k^2}$ converges.


14. Let p and q be arbitrary natural numbers. Evaluate

(a) $\sum_{k=1}^{n} \dfrac{1}{(k + p)(k + p + q)}$.

(b) $\sum_{k=1}^{n} \dfrac{1}{k(k + p)(k + p + q)}$.

15. Evaluate

(a) $\dfrac{1}{1\cdot 2\cdot 3} + \dfrac{1}{2\cdot 3\cdot 4} + \cdots + \dfrac{1}{n(n + 1)(n + 2)}$.

(b) $\sum_{k=1}^{n} \dfrac{1}{k(k + 1)(k + 3)}$.

(c) Evaluate the limit of each of the above expressions as $n \to \infty$.

*(d) Let $a_1, a_2, \ldots, a_m$ be nonnegative integers with $a_1 < a_2 < \cdots < a_m$.
Show how to obtain a formula for

$$S_n = \sum_{k=1}^{n} \frac{1}{(k + a_1)(k + a_2)\cdots(k + a_m)}$$

and how to find $\lim_{n\to\infty} S_n$.


16. If $a_k$ is monotone and $\sum a_k$ converges, show that $\lim_{k\to\infty} ka_k = 0$.


17. If $a_k$ is monotone decreasing with limit 0 and $b_k = a_k - 2a_{k+1} + a_{k+2} \ge 0$
for all k, then show $\sum_{k=1}^{\infty} kb_k = a_1$.


SECTION 1.8, page 82 


1. Prove that $\lim_{m\to\infty} (\cos \pi x)^{2m}$ exists for each value of x and is equal to 1
or 0 according to whether x is an integer or not.

2. (a) Prove that $\lim_{n\to\infty}\left[\lim_{m\to\infty} (\cos n!\,\pi x)^{2m}\right]$ exists for each value of x and
is equal to 1 or 0 according to whether x is rational or irrational.
(b) Discuss the continuity of these limit functions.

3. Let f(x) be continuous for $0 \le x \le 1$. Suppose further that f(x)
assumes rational values only, and that f(x) = 1/2 when x = 1/2. Prove that
f(x) = 1/2 everywhere.


SECTION 1.S.1, page 89


1. Let r = p/q, s = m/n be arbitrary rational numbers where p, q, m, n
are integers and q, n are positive. In terms of the integers p, q, m, n, define

(a) r + s, (b) r - s, (c) rs, (d) r/s, (e) r < s.


2. Prove for nested sequences of rational numbers $[a_n, b_n]$ and $[a_n', b_n']$
that each of the following conditions is necessary and sufficient for equivalence:

(a) $a_n' - a_n$ is a null sequence,

(b) $a_n \le b_n'$ and $a_n' \le b_n$.

3. Given $x \sim \{[a_n, b_n]\}$, $y \sim \{[\alpha_n, \beta_n]\}$, (a) verify that the definitions of
addition and subtraction,

$$x + y = \{[a_n + \alpha_n, b_n + \beta_n]\}, \quad x - y = \{[a_n - \beta_n, b_n - \alpha_n]\},$$

are meaningful. Specifically, verify that
(i) the given representations are, in fact, nested sets for x + y and x - y
when x and y are rational;
(ii) if x < y, then x + z < y + z, where z is an arbitrary real number.
(b) Define the product xy and verify that your definition of
product is meaningful. Specifically, verify
(i) that the given nested set is, in fact, a nested set for xy when x and y
are rational;
(ii) that if x < y and z > 0, then xz < yz.
4. Prove that the following principles are equivalent in the sense that any 
one can be derived as a consequence of any other. 
(a) Every nested sequence of intervals with real end points contains a 
real number. 
(b) Every bounded monotone sequence converges. 
(c) Every bounded infinite sequence has at least one accumulation or 
limit point. 
(d) Every Cauchy sequence converges. 
(e) Every bounded set of real numbers has an infimum and a supremum. 


Miscellaneous Problems 
1. If $w_1, w_2, \ldots, w_n > 0$, prove that the weighted average

$$\frac{w_1x_1 + w_2x_2 + \cdots + w_nx_n}{w_1 + w_2 + \cdots + w_n}$$

lies between the greatest and the least of the x’s.


2. Prove

$$2(\sqrt{n + 1} - 1) < 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} < 2\sqrt{n}.$$

3. Prove for x, y > 0

$$\frac{x^n + y^n}{2} \ge \left(\frac{x + y}{2}\right)^n.$$

Interpret this result geometrically in terms of the graph of $x^n$.

4. If $a_1 \ge a_2 \ge \cdots \ge a_n$ and $b_1 \ge b_2 \ge \cdots \ge b_n$, prove

$$n\sum_{i=1}^{n} a_ib_i \ge \left(\sum_{i=1}^{n} a_i\right)\left(\sum_{i=1}^{n} b_i\right).$$

5. (a) Show that the sequence $a_1, a_2, a_3, \ldots$ can be written as the sequence
of partial sums of the series $u_1 + u_2 + u_3 + \cdots$ where $u_n = a_n - a_{n-1}$ for n > 1
and $u_1 = a_1$.

(b) Write the sequence $a_n = n^3$ as the sequence of partial sums of a series.

(c) From the result obtain a formula for the nth partial sum of the series

$$1 + 4 + 9 + \cdots + n^2 + \cdots$$

(d) From the formula for $1^2 + 2^2 + \cdots + n^2$, find a formula for
$1^2 + 3^2 + 5^2 + \cdots + (2n + 1)^2$.


6. A sequence is called an arithmetic progression of the first order if the 
differences of successive terms are constant. It is called an arithmetic 
progression of the second order if the differences of successive terms form 
an arithmetic progression of the first order; and, in general, it is called an 
arithmetic progression of order k if the differences of successive terms form 
an arithmetic progression of order (k — 1). 

The numbers 4, 6, 13, 27, 50, 84 are the first six terms of an arithmetic 
progression. What is its least possible order? What is the eighth term of 
the progression of smallest order with these initial terms? 


7. Prove that the nth term of an arithmetic progression of the second 
order can be written in the form an? + bn + c, where a, b, c are independent 
of n. 


*8. Prove that the nth term of an arithmetic progression of order k can
be written in the form $an^k + bn^{k-1} + \cdots + pn + q$, where a, b, ..., p, q
are independent of n.

Find the nth term of the progression of smallest order in Problem 6. 

9. Find a formula for the nth term of the arithmetic progressions of 
smallest order for which the following are the initial terms: 

(a) 1, 2,4, 7, 11, 16,.... 

(b) —7, —10, —9, 1, 25, 68,.... 

*10. Show that the sum of the first n terms of an arithmetic progression
of order k is

$$a_kS_k + a_{k-1}S_{k-1} + \cdots + a_1S_1 + a_0n,$$

where $S_\nu$ represents the sum of the first n $\nu$th powers and the $a_\nu$ are independent
of n. Use this result to evaluate the sums for the arithmetic progressions of
Problem 9.


11. By summing

$$\nu(\nu + 1)(\nu + 2)\cdots(\nu + k + 1) - (\nu - 1)\nu(\nu + 1)\cdots(\nu + k)$$

from $\nu = 1$ to $\nu = n$, show that

$$\sum_{\nu=1}^{n} \nu(\nu + 1)(\nu + 2)\cdots(\nu + k) = \frac{n(n + 1)\cdots(n + k + 1)}{k + 2}.$$

12. Evaluate $1^3 + 2^3 + \cdots + n^3$ by using the relation

$$\nu^3 = \nu(\nu + 1)(\nu + 2) - 3\nu(\nu + 1) + \nu.$$

13. Show that the function 


x #0 
fia) = Pea 
0, x =0 


is continuous but not Hölder-continuous. (Hint: Show Hölder continuity
with exponent $\alpha$ fails at the origin by considering the values $x = 1/2^{n/\alpha}$.)


14. Let $a_n$ be a monotone decreasing sequence of nonnegative numbers.
Show that $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{\nu=0}^{\infty} 2^\nu a_{2^\nu}$ does.

15. Investigate for convergence and determine the limit when possible,

(a) $n!\,e - [n!\,e]$
(b) $a_n/a_{n+1}$, where $a_1 = 0$, $a_2 = 1$, and $a_{k+2} = a_{k+1} + a_k$.

2 


The Fundamental Ideas of the 
Integral and Differential Calculus 


The fundamental limiting processes of calculus are integration and 
differentiation. Isolated instances of these processes of calculus were 
considered even in antiquity (culminating in the work of Archimedes), 
and with increasing frequency in the sixteenth and seventeenth centuries. 
However, the systematic development of calculus, started only in the 
seventeenth century, is usually credited to the two great pioneers of 
science, Newton and Leibnitz. The key to this systematic development 
is the insight that the two processes of differentiation and integration, 
which had been treated separately, are intimately related by being 
reciprocal to each other.¹

A fair historical assessment of the merits cannot attribute the 
invention of calculus to sudden unexplainable flashes of genius on the 
part of one or two individuals. Many people, such as Fermat, Galileo, 
and Kepler, stimulated by the revolutionary new ideas in science, 
contributed to the foundations of calculus. In fact, Newton’s teacher, 
Barrow, was almost in full possession of the basic insight into the 
reciprocity between differentiation and integration, the cornerstone of 
the systematic calculus of Newton and Leibnitz. Newton has stated 
the concepts somewhat more clearly; on the other hand, Leibnitz’s 
ingenious notation and methods of calculation are highly suggestive 
and remain indispensable. The work of these two men immediately 
stimulated the higher branches of analysis including the calculus of 
variations and the theory of differential equations, and led to innumerable
applications in science. Curiously enough, although Newton,
Leibnitz, and their immediate successors made such varied uses of the
powerful tool put into their hands, none succeeded in completely
clarifying the basic concepts involved in their work. Their arguments
employed “infinitely small quantities” in ways which are logically
indefensible and unconvincing. Clarification came at last in the nineteenth
century with the careful formulation of the concept of limit and
with the analysis of the number continuum as explained in Chapter 1.

1 This fact constitutes the “fundamental theorem of calculus.”

We begin with a discussion of the fundamental concepts. They can 
be fully appreciated only through concrete illustrations and examples; 
it is therefore recommended here, as at many places in this book, that 
theoretical and general sections be carefully studied again after the 
reader has absorbed more specific and concrete material in subsequent 
sections. 


2.1 The Integral 
a. Introduction 


Only after a lengthy development did the systematic procedures of
integration and differentiation meet the need for precise mathematical
descriptions of intuitive notions arising in geometry and natural
science. Differentiation is the concept needed for describing the notions
of tangents to curves and of velocity of moving particles, or more
generally, the concept of rate of change. The intuitive concept of area
of a region with curved boundaries finds its precise mathematical
formulation in the process of integration. Many other related concepts
in geometry and physics also require integration, as we shall see later.
In this section we introduce the concept of integral in connection with
the problem of measuring the area of a plane region bounded by curves.


Areas. We have an intuitive feeling that a region contained in a
closed curve has an “area” which measures the number of square
units inside the curve. Yet the question of how this measure for the
area can be described in precise terms necessitates a chain of mathematical
steps. The basic properties of area which intuition suggests
are: area is a (positive) number (depending on the choice of the unit
of length); this number is the same for congruent figures; for all
rectangles it is the product of the lengths of two adjacent sides; and
finally, for a region decomposed into parts, the area of the whole is
equal to the sum of the areas of the parts.

An immediate consequence is the fact: for a region A which is part
of a region B, the area of A cannot exceed the area of B.

1 The emergence of calculus extending over more than 2000 years represents one of
the most fascinating chapters in the history of scientific discovery. Interested
readers are referred to Carl B. Boyer, Concept of the Calculus, Hafner Publishing
Company, 1949. See also O. Toeplitz, Calculus, A Genetic Approach, University of
Chicago, 1963.

These properties permit the direct computation of the area of any
figure that can be decomposed into a finite number of rectangles.
More generally, to assign a value F to the area of a region R we consider
two other regions R' (inscribed) and R'' (circumscribed) decomposable
into rectangles, where R'' contains R and R' is contained in R (cf.
Fig. 2.1). We know then at least that F has to lie between the areas of
R' and R''. The value of F is completely determined if we find sequences
of circumscribed regions $R_n''$ and inscribed regions $R_n'$ which are both
decomposable into rectangles and such that the areas of $R_n''$ and
$R_n'$ have the same limit as n tends to infinity. This is the method of
“exhaustion,” going back to antiquity, which is used in elementary
geometry to describe the area of a circle.¹ The precise formulation of
this intuitive idea now leads to the notion of integration.

Figure 2.1 Approximation of an area.


b. The Integral as an Area 


Area under a Curve 


The analytic notion of integral arises when we associate areas with 
functions: We consider the area of a region bounded on the left and 


1 Of course, we may use any kind of inscribed and circumscribed polygon, since a 
polygon can be decomposed into right triangles and the area of a right triangle 
clearly is half that of a rectangle with the same sides. 
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right by vertical lines x = a and x = b, below by the z-axis and above 
by the graph of a positive continuous function f(x) (Fig. 2.2). This is 
referred to in brief as the area “under the curve.” For the moment we 
accept as intuitive the idea that the area of such a region is a definite 
number. We call this area F,’ the integral of the function f between the 


Figure 2.2 


limits’ a and b. In seeking the numerical value of F,? we make use of 
approximations by sums of areas of rectangles. For that purpose we 
divide the interval (a, b) of the x-axis into n (small) parts, not necessarily 
of the same size, which we shall call cells. At each point of division we 
draw the line perpendicular to the x-axis up to the curve. The region 
with area F, is thus divided into n strips, each bounded by a portion of 


Figure 2.3 


the graph of the function f(x) and by three straight line segments 
(Fig. 2.3). 

Area or Integral as Limit of a Sum. Calculating the area of such 
strips precisely is not easier than calculating that of the original region. 
It is a step forward, however, to approximate the area of each strip 
from above and from belcw by the areas of the circumscribed and 
1 No confusion should arise from the use of the word “limit” for boundary points of 
the interval of integration. 
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inscribed rectangles with the same base, where the curved boundary of 
the strip is replaced by a horizontal line at a distance from the x-axis 
which is either the greatest or the smallest value of f(x) in the cell 
(Fig. 2.4). More generally, we obtain an intermediate approximation 
if we replace the strip by a rectangle of the same base and bounded on 


y 


4 j LAS 


Figure 2.4 


top by any horizontal line which intersects the curved boundary of the 
strip (see Fig. 2.5). Analytically, this amounts to replacing the function 
f(x) in each of the cells by some intermediate constant value. We 
denote by F, the sum of the n rectangular areas. Intuition tells us that 
the values F, tend to F,’ if we make the subdivision finer and finer, that 
is, if we let n increase without limit while the largest length of the 


04 / 

crt ity 
Fa foe ALY thf, 
YY YH, 


/ 
ra 
/, 


Figure 2.5 


individual cells tends to zero. In this way F,’ is represented as a limit of 
areas consisting of rectangles. 
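
The squeeze between inscribed and circumscribed rectangles can be tried numerically. The following is only a minimal sketch: the function name `lower_upper_sums` and the `samples` parameter are illustrative choices, not part of the text, and the least and greatest value of f in each cell are estimated by sampling, which is exact only when f is monotone on the cell.

```python
def lower_upper_sums(f, a, b, n, samples=50):
    """Return (lower, upper) rectangle sums for f on [a, b] with n equal cells.

    The least and greatest value of f in each cell are estimated by
    sampling (an assumption of this sketch; exact for monotone f).
    """
    h = (b - a) / n
    lower = upper = 0.0
    for i in range(n):
        left = a + i * h
        values = [f(left + h * k / samples) for k in range(samples + 1)]
        lower += min(values) * h   # inscribed rectangle
        upper += max(values) * h   # circumscribed rectangle
    return lower, upper


# The two sums enclose the area under y = x^2 on [0, 1] ever more tightly.
for n in (4, 16, 64, 256):
    lo, up = lower_upper_sums(lambda x: x * x, 0.0, 1.0, n)
    print(n, lo, up)
```

For an increasing function on equal cells the gap between the two sums is h times f(b) - f(a), so it shrinks in proportion to 1/n.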


c. Analytic Definition of the Integral. Notations 


Definition and Existence of Integrals 


In the last paragraph we accepted the area under a curve as a quantity 
given intuitively and subsequently we represented it as a limiting 
value. Now we shall reverse the procedure. We no longer invoke 
intuition to assign an area to the region under a continuous curve; 
on the contrary, we shall begin in a purely analytic way with the sums 
$F_n$ defined previously, and we shall prove that these sums tend to a 
definite limit. This limit is then the precise definition of the integral 
and of the area. 

Let the function f(x) be continuous (but not necessarily positive) in 
the closed interval $a \le x \le b$. We divide the interval by (n − 1) 
points $x_1, x_2, \ldots, x_{n-1}$ into n equal or unequal cells with the lengths¹ 

$$x_i - x_{i-1} = \Delta x_i \qquad (i = 1, 2, \ldots, n),$$

where in addition we put $x_0 = a$, $x_n = b$ (cf. Fig. 2.6). In each closed 
subinterval $[x_{i-1}, x_i]$ or cell we choose any point $\xi_i$ whatever. We 
form the sum 

$$F_n = f(\xi_1)(x_1 - x_0) + f(\xi_2)(x_2 - x_1) + \cdots + f(\xi_n)(x_n - x_{n-1})
      = f(\xi_1)\,\Delta x_1 + f(\xi_2)\,\Delta x_2 + \cdots + f(\xi_n)\,\Delta x_n.$$


Figure 2.6 To illustrate the analytical definition of integral. 


1 The symbol Δ must not be interpreted as a factor but only as indicating a difference 
in values of the variable which follows. Thus the symbol $\Delta x_i$ means the difference 
$x_i - x_{i-1}$ of consecutive values of x. 


Using the summation symbol we write more concisely 

$$F_n = \sum_{i=1}^{n} f(\xi_i)(x_i - x_{i-1})$$

or 

$$F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i.$$

If f(x) is positive, the value $F_n$ represents the area under the curve 
obtained by replacing f in each subinterval by the constant value 
$f(\xi_i)$. Of course, the sums $F_n$ can be formed without assuming f to be 
positive. It appears intuitively plausible that the sums $F_n$ must tend to 
a limit $F_a^b$ as the number n of intervals increases indefinitely and at the 
same time the length of the largest subinterval tends to zero. This 
would imply that the value of the limit $F_a^b$ is independent of the 
particular manner in which the points of division $x_1, x_2, \ldots, x_{n-1}$ and 
the intermediate points $\xi_1, \xi_2, \ldots, \xi_n$ are chosen. We call $F_a^b$ the 
integral of f(x) between the limits a and b. 
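
To make the definition concrete, here is a small sketch that forms the sums $F_n$ for unequal cells and arbitrary intermediate points. The helper names (`riemann_sum`, `random_partition`, `mesh`) and the use of random division points are assumptions made for illustration only; the text prescribes no particular choice.

```python
import random


def riemann_sum(f, points, tags):
    """Sum of f(xi_i) * (x_i - x_{i-1}) over consecutive division points."""
    return sum(f(t) * (x1 - x0) for x0, x1, t in zip(points, points[1:], tags))


def random_partition(a, b, n, rng):
    """n cells with random division points and a random point in each cell."""
    inner = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + inner + [b]
    tags = [rng.uniform(x0, x1) for x0, x1 in zip(points, points[1:])]
    return points, tags


def mesh(points):
    """Length of the largest cell."""
    return max(x1 - x0 for x0, x1 in zip(points, points[1:]))


rng = random.Random(0)
for n in (10, 100, 1000, 10000):
    points, tags = random_partition(0.0, 1.0, n, rng)
    print(n, mesh(points), riemann_sum(lambda x: x * x, points, tags))
```

As the largest cell length shrinks, the printed sums for f(x) = x² on [0, 1] settle near 1/3 regardless of how the points were drawn.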

Geometric intuition, no matter how convincing, can only serve as a 
guide to our analytical limiting process; therefore an analytic 
justification is needed, and we must furnish a proof for the existence 
of the integral as the limit described above. Furthermore, as already 
said, we need not at all insist on the assumption that the function f is 
positive in the interval. 

Thus we assert 


THEOREM OF EXISTENCE. For any continuous function f(x) in a closed 
interval [a, b] the integral over this interval exists as the limit of the 
sums $F_n$ described above (independently of the choice of the points of 
subdivision $x_1, \ldots, x_{n-1}$ and of the intermediate points $\xi_1, \ldots, \xi_n$ as 
long as the largest of the lengths $\Delta x_i$ tends to zero). 


We shall first gain some experience and insight before considering 
the existence proof for the integral in the Supplement (p. 192). 


Leibnitz’s Notation for the Integral 


The definition of the integral as the limit of a sum led Leibnitz to 
express the integral by the following symbol: 

$$\int_a^b f(x)\,dx.$$

The integral sign is a modification of the summation sign in the shape 
of a long S used at Leibnitz’s time. The passage to the limit from a 
finite subdivision into portions $\Delta x_i$ is indicated by the use of the letter 
d in place of Δ. In using this notation, however, we must not tolerate 
the eighteenth century mysticism of considering dx as an “infinitely 
small” or “infinitesimal quantity,” or considering the integral as a “sum 
of an infinite number of infinitely small quantities.” Such a conception 
is devoid of clear meaning and obscures what we have previously 
formulated with precision. From our present viewpoint the individual symbol 
dx has not been defined at all. The suggestive combination of symbols 
$\int_a^b f(x)\,dx$ is defined for a function f(x) in the interval [a, b] by forming 
the ordinary sums $F_n$ and passing to the limit as $n \to \infty$. 
The particular symbol we use for the variable of integration is a 
matter of complete indifference (just as in the notation for sums it 
did not matter what we called the index of summation); instead of 
$\int_a^b f(x)\,dx$ we can equally well write $\int_a^b f(t)\,dt$ or $\int_a^b f(u)\,du$. The 
integrand denoted by f is a function of an independent variable over the 
interval [a, b] and the name of the variable is irrelevant. Only the end 
points of the interval of integration a and b affect the value of the 
integral for given f. Expressions like $\int_a^x f(x)\,dx$ or $\int_a^b f(a)\,da$ in which 
the same letter is used for the variable of integration and an end-point 
of the interval are misleading under our definition and should, at first, 
be avoided. 

If the integrand f(x) is positive in the interval [a, b], we can 
immediately identify $\int_a^b f(x)\,dx$ with the area bounded by the graph of f 
and the lines x = a, x = b, and y = 0. The integral of f, however, is 
defined analytically as the limit of sums $F_n$ independent of any 
assumption on the sign of f. If f(x) is negative in all or part of our interval, the 
only effect is to make the corresponding factors $f(\xi_i)$ in our sum 
negative instead of positive. To the region bounded by the part of the 
curve below the x-axis we shall then naturally assign a negative area. 
The integral will thus be the sum of positive and negative terms, 
corresponding respectively to portions of the curve above and below the 
x-axis¹ (see Fig. 2.7). 


Figure 2.7 


It is intuitively convincing that our limit process converges even if 
the function f(x) is not everywhere continuous, but has jump 
discontinuities at one or several points like the function indicated by the curve 
in Fig. 2.8, where clearly an area under the curve exists.² 


Figure 2.8 


Figure 2.9 $\int_{-1}^{1} \operatorname{sgn} x\,dx = 0$. 


1 Areas of regions bounded by arbitrary closed curves will be considered in Chapter 4. 
2 As another example consider f(x) = sgn x on [−1, 1]. We have f(x) = −1 for 
x < 0 and f(x) = +1 for x > 0 (see Fig. 2.9). Then $\int_{-1}^{+1} f(x)\,dx = 0$. 


Thus the preceding limit process may well result in a definite limit 
of the sum $F_n$ for functions having some discontinuities; we indicate this 
possibility by calling such functions integrable. In the middle of the 
nineteenth century, the great Bernhard Riemann first analyzed the 
applicability of the process of integration to general functions. More 
recently, various extensions of the concept of integration itself have 
been introduced. Yet such refinements have less immediate importance 
for the calculus aimed at intuitively accessible phenomena, and it 
will not be necessary for us always to emphasize the integrability of 
our functions as a reminder that nonintegrable functions can be 
defined. 

In advanced courses the integral we have defined here is called the 
Riemann integral to distinguish it from various generalized concepts 
of integral; the approximating sums $F_n$ are called Riemann sums. 


2.2 Elementary Examples of Integration 


In a number of significant cases we are now able to calculate the 
integral of a function by carrying out the prescribed limiting process. 
This we shall do by an explicit evaluation of the sums $F_n$ for a suitable 
choice of intermediate points $\xi_i$ (usually the left or right end point of 
the cells). The theorem on the existence of the integral of a continuous 
function assures that the limit of the $F_n$ is the same for any other 
choice of the intermediate points $\xi_i$ and for any method of subdivision. 


a. Integration of a Linear Function 


First we verify that the integral indeed gives the correct value of the 
area for some simple figures we know from geometry. 

Let f(x) = constant = γ. To calculate the integral of f(x) between 
the limits a and b we form the sums $F_n$ (see Fig. 2.10). Since here 
$f(\xi_i) = \gamma$, we find 

$$F_n = \sum_{i=1}^{n} \gamma\,\Delta x_i = \gamma \sum_{i=1}^{n} \Delta x_i = \gamma(b - a).$$

Hence, likewise 

$$\lim_{n \to \infty} F_n = \int_a^b \gamma\,dx = \gamma(b - a).$$

This is just the formula for the area of a rectangle of height γ and base 
b − a. 


Figure 2.10 Integral of a constant. 


The integral of the function f(x) = x, 

$$\int_a^b x\,dx$$

(Fig. 2.11), as we know from elementary geometry, has the value 

$$\tfrac{1}{2}(b - a)(b + a) = \tfrac{1}{2}(b^2 - a^2).$$

To confirm that our limiting process leads analytically to the same 
result, we subdivide the interval from a to b into n equal parts by means 
of the points of division 

$$a + h,\; a + 2h,\; \ldots,\; a + (n - 1)h,$$

where h = (b − a)/n. Taking for $\xi_i$ the right-hand end point of each 
interval we find the integral as the limit as $n \to \infty$ of the sum 

$$F_n = (a + h)h + (a + 2h)h + \cdots + (a + nh)h
      = nah + (1 + 2 + 3 + \cdots + n)h^2 = nah + \tfrac{1}{2}n(n + 1)h^2,$$

where we have used the well-known formula for the sum of an 
arithmetic progression (see p. 111, Problem 3). Substituting h = (b − a)/n, 
we see that 

$$F_n = a(b - a) + \frac{1}{2}\left(1 + \frac{1}{n}\right)(b - a)^2,$$

from which it follows immediately that 

$$\lim_{n \to \infty} F_n = a(b - a) + \tfrac{1}{2}(b - a)^2 = \tfrac{1}{2}(b^2 - a^2).$$


Figure 2.11 


b. Integration of $x^2$ 


Elementary geometry does not so easily lead to the integration of the 
function $f(x) = x^2$, that is, to the determination of the area of the 
region¹ bounded by a segment of a parabola, a segment of the x-axis, 
and two ordinates. A genuine limit process is needed. Assuming 
a < b we choose the same points of division and the same intermediate 
points as in the previous example (see Fig. 2.12). It follows then that 
the integral of $x^2$ between the limits a and b is the limit of the sums 

$$F_n = (a + h)^2 h + (a + 2h)^2 h + \cdots + (a + nh)^2 h
      = na^2 h + 2ah^2(1 + 2 + 3 + \cdots + n) + h^3(1^2 + 2^2 + 3^2 + \cdots + n^2);$$

by using the known values of the sums enclosed in parentheses we 
find (see p. 58) 

$$F_n = na^2 h + n(n + 1)ah^2 + \tfrac{1}{6}\,[n(n + 1)(2n + 1)]\,h^3
      = a^2(b - a) + \left(1 + \frac{1}{n}\right)a(b - a)^2 + \frac{1}{6}\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right)(b - a)^3.$$

Since $\lim_{n \to \infty} \frac{1}{n} = 0$, we have 

$$\lim_{n \to \infty} F_n = a^2(b - a) + a(b - a)^2 + \tfrac{1}{3}(b - a)^3 = \tfrac{1}{3}(b^3 - a^3).$$

Thus, for a < b, 

$$\int_a^b x^2\,dx = \frac{1}{3}(b^3 - a^3).$$


Figure 2.12 Area under a parabolic arc by arithmetic subdivision. 


1 Sometimes referred to as “squaring” the region. 

*c. Integration of $x^\alpha$ for Integers $\alpha \neq -1$ 


The next examples of this section are instructive illustrations showing 
that in some cases the integration can be carried out by special 
elementary devices. Later in Section 2.9d (p. 191) we shall achieve the 
same results more simply by using general methods. 

The same kind of argument as used for x and $x^2$, applied to the 
functions $x^3, x^4, \ldots$, results in the relation 

(1)   $$\int_a^b x^\alpha\,dx = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right),$$

where α is any positive integer; this can be proved by finding 
appropriate formulas for the sums $1^\alpha + 2^\alpha + \cdots + n^\alpha$, such as the relation 

$$\lim_{n \to \infty} \frac{1}{n^{\alpha+1}}\left(1^\alpha + 2^\alpha + \cdots + n^\alpha\right) = \frac{1}{\alpha + 1},$$

which can be proved by induction over α (see Problem 16, p. 113). 
In the following section, formula (1) will be proved in a different way, 
with greater generality and simplicity, indicating the power of the 
methods that we will develop. Its validity will be extended to all real 
values of α except α = −1. 

Fortunately, the definition of the integral leaves us a great deal of 
latitude in the choice of subdivisions and furnishes a much simpler way 
to evaluate the integral. We do not have to use sums based on 
equidistant points of division. Instead, with the “quotient” $\sqrt[n]{b/a} = q$ we 
subdivide the interval [a, b] by the points of a geometric progression 
(Fig. 2.13), 

$$a,\; aq,\; aq^2,\; \ldots,\; aq^{n-1},\; aq^n = b;$$

we then need only to evaluate the sum of a geometric series. Given the 
points of division $x_i = aq^i$ the length of the ith cell is given by 

$$\Delta x_i = aq^i - aq^{i-1} = \frac{aq^i(q - 1)}{q}.$$

The largest $\Delta x_i$ is the last: 

$$\Delta x_n = \frac{aq^n(q - 1)}{q} = \frac{b(q - 1)}{q}.$$

For $n \to \infty$ the number q tends toward the value one (see Example d, 
p. 64), and hence the length $\Delta x_n$ of the largest cell, and then also the 
lengths of all cells tend to zero. For the intermediate points $\xi_i$ we choose 
again the right-hand end points $x_i$ of each cell. The sum 

(2)   $$F_n = \sum_{i=1}^{n} \xi_i^\alpha\,\Delta x_i = \sum_{i=1}^{n} (aq^i)^\alpha\,\frac{aq^i(q - 1)}{q}
          = a^{\alpha+1}\,\frac{q - 1}{q} \sum_{i=1}^{n} \left(q^{\alpha+1}\right)^i$$

is known explicitly from the sum of the geometric progression with 
ratio $q^{\alpha+1}$. Applying the well-known formula (p. 67), we find 

$$F_n = a^{\alpha+1}\,\frac{q - 1}{q}\,q^{\alpha+1}\,\frac{q^{n(\alpha+1)} - 1}{q^{\alpha+1} - 1}
      = a^{\alpha+1}(q - 1)\,q^\alpha\,\frac{(b/a)^{\alpha+1} - 1}{q^{\alpha+1} - 1}
      = \left(b^{\alpha+1} - a^{\alpha+1}\right) q^\alpha\,\frac{q - 1}{q^{\alpha+1} - 1}.$$

Since $q \neq 1$, we can use once more the formula for the sum of a 
geometric progression and write 

$$F_n = \left(b^{\alpha+1} - a^{\alpha+1}\right) \frac{q^\alpha}{q^\alpha + q^{\alpha-1} + \cdots + 1}.$$

For $n \to \infty$ all powers of q tend to one and it follows that 

$$\lim_{n \to \infty} F_n = \frac{1}{1 + \alpha}\left(b^{1+\alpha} - a^{1+\alpha}\right).$$

In this way we have verified the formula (1) for the integral of $x^\alpha$ for 
0 < a < b and any positive integer α. 


Figure 2.13 Area under a parabolic arc by geometric subdivision. 
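
The geometric subdivision is easy to reproduce numerically. The sketch below, with illustrative function names, forms the sum (2) directly from the points $x_i = aq^i$ and compares it with the closed expression $(b^{\alpha+1} - a^{\alpha+1})\,q^\alpha (q-1)/(q^{\alpha+1}-1)$; it assumes 0 < a < b and uses floating-point arithmetic, so agreement holds only up to rounding.

```python
def geometric_sum(alpha, a, b, n):
    """Sum (2): right end points on the division points a*q^i, q = (b/a)^(1/n)."""
    q = (b / a) ** (1.0 / n)
    total = 0.0
    x_prev = a
    for i in range(1, n + 1):
        x_i = a * q**i
        total += x_i**alpha * (x_i - x_prev)   # xi_i^alpha * delta x_i
        x_prev = x_i
    return total


def geometric_closed_form(alpha, a, b, n):
    """(b^(alpha+1) - a^(alpha+1)) * q^alpha * (q - 1) / (q^(alpha+1) - 1)."""
    q = (b / a) ** (1.0 / n)
    return (b ** (alpha + 1) - a ** (alpha + 1)) * q**alpha * (q - 1) / (q ** (alpha + 1) - 1)


# For alpha = 3 on [1, 2] formula (1) gives (2^4 - 1^4)/4 = 3.75.
for n in (10, 100, 1000):
    print(n, geometric_sum(3, 1.0, 2.0, n), geometric_closed_form(3, 1.0, 2.0, n))
```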
The same method applies also for negative integers α, provided that 
$\alpha \neq -1$. For the sum $F_n$ we obtain as before 

$$F_n = \left(b^{\alpha+1} - a^{\alpha+1}\right) q^\alpha\,\frac{q - 1}{q^{\alpha+1} - 1}
      = \left(b^{\alpha+1} - a^{\alpha+1}\right) \frac{q - 1}{q\left(1 - q^{-\alpha-1}\right)},$$

where we recall that −α is positive and greater than one. Applying the 
formula for a geometric progression, we obtain 

$$\frac{q - 1}{q\left(q^{-\alpha-1} - 1\right)} = \frac{1}{q^{-\alpha-1} + q^{-\alpha-2} + \cdots + q},$$

which tends to 1/(−α − 1) as $n \to \infty$. Consequently, as before, 

$$\lim_{n \to \infty} F_n = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right).$$

The integral formula is meaningless for α = −1, since both 
numerator and denominator on the right-hand side would then be zero. 
We find instead from our original expression (2) for $F_n$ for the case 
α = −1 that $F_n = n(q - 1)/q$. Consequently, observing that 
$q = \sqrt[n]{b/a}$ tends to one as $n \to \infty$, we find 

(3)   $$\int_a^b \frac{1}{x}\,dx = \lim_{n \to \infty} n\left(\sqrt[n]{b/a} - 1\right).$$

Here the limit on the right-hand side cannot be expressed in terms of 
powers of a and b but can be expressed in terms of logarithms of those 
quantities as we shall see later (p. 145). 
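
The sequence in (3) can be tabulated directly. The sketch below compares it with `math.log(b / a)` from Python's standard library; that comparison anticipates the later identification of the limit with a logarithm and is used here only as a numerical reference, not as part of the derivation.

```python
import math


def log_limit_term(a, b, n):
    """The n-th term n * ((b/a)^(1/n) - 1) of the sequence in formula (3)."""
    return n * ((b / a) ** (1.0 / n) - 1.0)


for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, log_limit_term(1.0, 2.0, n), math.log(2.0))
```

The terms decrease toward the reference value; for very large n the subtraction inside the term loses floating-point accuracy, so the table should not be pushed much further in double precision.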


*d. Integration of $x^\alpha$ for Rational α Other Than −1 


The result obtained previously may be generalized considerably 
without essentially complicating the proof. Let α = r/s be a positive 
rational number, r and s being positive integers; then in the evaluation 
of the integral given above nothing is changed except the evaluation of 
the limit of $(q - 1)/(q^{\alpha+1} - 1)$ as q approaches one. This expression is 
now simply $(q - 1)/(q^{(r+s)/s} - 1)$. Let us put $q^{1/s} = \tau$ ($\tau \neq 1$). Then 
as q tends to one, τ also tends to one. We have therefore to find the 
limiting value of $(\tau^s - 1)/(\tau^{r+s} - 1)$ as τ approaches one. If we 
divide both numerator and denominator by τ − 1 and transform them 
as before by the formula for geometric progressions, the limit simply 
becomes 

$$\lim_{\tau \to 1} \frac{\tau^{s-1} + \tau^{s-2} + \cdots + 1}{\tau^{r+s-1} + \tau^{r+s-2} + \cdots + 1}.$$

Since both numerator and denominator are continuous in τ, this limit 
is at once obtained by substituting τ = 1, and thus equals s/(r + s) = 
1/(α + 1); hence for every positive rational value of α we obtain the 
integral formula 

$$\int_a^b x^\alpha\,dx = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right),$$

just as with positive integers. 

This formula remains valid for negative rational values of α = 
−r/s as well, provided we exclude the value α = −1 (for which the 
formula used above for the sum of the geometric progression loses its 
meaning). 

For negative α we again evaluate the limit of $(q - 1)/(q^{\alpha+1} - 1)$ by 
putting $q^{-1/s} = \tau$ for α = −r/s; this is left as an exercise for the reader. 

It is natural to guess that the range of validity of our last formula extends 
also to irrational values of α. We shall actually establish our integral formula 
for all real values of α (except α = −1) in Section 2.7 (p. 154) in a quite 
simple way as a consequence of the general theory. 


*e. Integration of sin x and cos x 


The last elementary example to be treated here by means of a special device 
is the integral of f(x) = sin x. The integral 

$$\int_a^b \sin x\,dx$$

clearly is the limit of the sum 

$$S_h = h\,[\sin(a + h) + \sin(a + 2h) + \cdots + \sin(a + nh)],$$

arising from division of the interval of integration into cells of size h = 
(b − a)/n. We multiply the right-hand expression by 2 sin(h/2) and recall the 
well-known trigonometrical formula 

$$2 \sin u \sin v = \cos(u - v) - \cos(u + v).$$

Provided h is not a multiple of 2π, we obtain the formula 

$$S_h = \frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \tfrac{h}{2}\right) - \cos\left(a + \tfrac{3h}{2}\right) + \cos\left(a + \tfrac{3h}{2}\right) - \cos\left(a + \tfrac{5h}{2}\right) + \cdots + \cos\left(a + \tfrac{2n-1}{2}h\right) - \cos\left(a + \tfrac{2n+1}{2}h\right)\right]
      = \frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \tfrac{h}{2}\right) - \cos\left(a + \tfrac{2n+1}{2}h\right)\right].$$

Since a + nh = b, the integral becomes the limit of 

$$\frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \tfrac{h}{2}\right) - \cos\left(b + \tfrac{h}{2}\right)\right] \quad \text{as } h \to 0.$$

Now we know from Chapter 1 (p. 84) that for $h \to 0$, the expression 
(h/2)/sin(h/2) approaches the limit one. The desired limit is then simply 
cos a − cos b, and we arrive at the integral 

$$\int_a^b \sin x\,dx = -(\cos b - \cos a).$$

Similarly, 

$$\int_a^b \cos x\,dx = \sin b - \sin a \quad \text{(see Problem 3, p. 196).}$$
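
The telescoping identity for $S_h$ can be confirmed numerically. In this sketch (function names are illustrative) the sum of sines is evaluated term by term and compared with the collapsed expression involving only two cosines.

```python
import math


def sine_sum(a, b, n):
    """S_h = h * [sin(a + h) + sin(a + 2h) + ... + sin(a + nh)], h = (b - a)/n."""
    h = (b - a) / n
    return h * sum(math.sin(a + i * h) for i in range(1, n + 1))


def sine_sum_closed(a, b, n):
    """h / (2 sin(h/2)) * [cos(a + h/2) - cos(b + h/2)]; needs sin(h/2) != 0."""
    h = (b - a) / n
    return h / (2.0 * math.sin(h / 2.0)) * (math.cos(a + h / 2.0) - math.cos(b + h / 2.0))


for n in (5, 50, 500):
    print(n, sine_sum(0.0, math.pi, n), sine_sum_closed(0.0, math.pi, n))
```

On [0, π] both columns approach cos 0 − cos π = 2.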
Each of the preceding examples was treated with a special device. 
Yet the essential point of the systematic integral and differential 
calculus is the very fact that, instead of such special devices, we use 
general considerations which lead directly to the result. We shall 
arrive at these methods by first discussing some general rules concerning 
integrals and then introducing the concept of the derivative, and finally 
establishing the connection between integral and derivative. 


2.3 Fundamental Rules of Integration 


The basic properties of the integral follow directly from its definition 
as the limit of a sum: 

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i,$$

where the interval [a, b] is broken up into subintervals or cells of 
length $\Delta x_i$, the number $\xi_i$ stands for any value in the ith subinterval, 
and the largest $\Delta x_i$ is required to tend to zero for $n \to \infty$. 


a. Additivity 


Let c be any value between a and b. If we interpret integrals as areas 
and remember that the area of a region consisting of several parts is 
the sum of the areas of the parts (Fig. 2.14), we are led to the rule 

(4)   $$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

For an analytical proof we choose our subdivisions in such a manner 
that the point c appears as a point of division, say $c = x_m$ (where m 
varies with n). Then 

$$\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i = \sum_{i=1}^{m} f(\xi_i)\,\Delta x_i + \sum_{i=m+1}^{n} f(\xi_i)\,\Delta x_i,$$

where the first sum on the right-hand side corresponds to a subdivision 
of the interval [a, c] in m cells and the second sum to a subdivision of the 
interval [c, b]. Now for $n \to \infty$ we obtain our rule for integrals. 

So far we have only defined $\int_a^b f(x)\,dx$ when a < b. For a = b or 
a > b we define the integral in such a way that the rule of additivity is 
preserved. Therefore for c = a we must define 

(5)   $$\int_a^a f(x)\,dx = 0,$$

and then for b = a it follows that 

$$\int_a^c f(x)\,dx + \int_c^a f(x)\,dx = \int_a^a f(x)\,dx = 0.$$

This leads us to define $\int_a^c f(x)\,dx$ for c < a by the formula 

(6)   $$\int_a^c f(x)\,dx = -\int_c^a f(x)\,dx,$$

where the right side has the meaning originally established. Its 
geometric meaning is that the area under the curve y = f(x) is to be counted 
as negative if the direction of moving from the lower limit of integration 
to the upper limit is that of decreasing x. A glance at the previous 
examples of integrals confirms that indeed an interchange in the limits of 
integration a and b results in changing the sign in the value of the integral. 


Figure 2.14 
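
Rules (4), (5), and (6) can be observed in a simple numerical sketch. The helper `midpoint_integral` below is an illustrative approximation by midpoint sums, not a construction from the text; because the step h = (b − a)/n carries a sign, the same code covers a > b and a = b.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Approximate the integral by a sum with midpoints as intermediate points.

    The step h is signed, so interchanging a and b flips the sign of the
    result, and a == b gives zero.
    """
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


whole = midpoint_integral(math.exp, 0.0, 2.0)
parts = midpoint_integral(math.exp, 0.0, 0.7) + midpoint_integral(math.exp, 0.7, 2.0)
print(whole, parts)                             # additivity (4), up to discretization error
print(midpoint_integral(math.exp, 1.0, 1.0))    # rule (5)
print(midpoint_integral(math.exp, 2.0, 0.0))    # rule (6): sign reversed
```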


b. Integral of a Sum and of a Product with a Constant 


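If f(x) and g(x) are any two (integrable) functions, the basic laws of 
operating with limits imply 

$$\int_a^b f(x)\,dx + \int_a^b g(x)\,dx
   = \lim_{n \to \infty} \left[\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i\right] + \lim_{n \to \infty} \left[\sum_{i=1}^{n} g(\xi_i)\,\Delta x_i\right]$$
$$ = \lim_{n \to \infty} \left[\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i + \sum_{i=1}^{n} g(\xi_i)\,\Delta x_i\right]
   = \lim_{n \to \infty} \sum_{i=1}^{n} \left[f(\xi_i) + g(\xi_i)\right] \Delta x_i,$$

and hence the important rule for the sum of two functions 

(7)   $$\int_a^b f(x)\,dx + \int_a^b g(x)\,dx = \int_a^b \left[f(x) + g(x)\right] dx;$$

similarly for the difference 

$$\int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b \left[f(x) - g(x)\right] dx.$$

Furthermore, with any constant α 

$$\int_a^b \alpha f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} \alpha f(\xi_i)\,\Delta x_i
   = \alpha \lim_{n \to \infty} \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i,$$

and so 

(8)   $$\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx.$$

The last two rules enable us to integrate “linear combinations” of 
two or more functions that can be integrated individually. Thus for 
any quadratic function $y = Ax^2 + Bx + C$ with any constants A, B, C, 
we have 

$$\int_a^b (Ax^2 + Bx + C)\,dx = \int_a^b Ax^2\,dx + \int_a^b Bx\,dx + \int_a^b C\,dx$$
$$ = A\int_a^b x^2\,dx + B\int_a^b x\,dx + C\int_a^b 1\,dx$$
$$ = \frac{A}{3}(b^3 - a^3) + \frac{B}{2}(b^2 - a^2) + C(b - a).$$

In the same way we integrate the general polynomial 

$$y = A_0 x^n + A_1 x^{n-1} + \cdots + A_{n-1} x + A_n:$$

$$\int_a^b y\,dx = \frac{1}{n+1} A_0 (b^{n+1} - a^{n+1}) + \frac{1}{n} A_1 (b^n - a^n) + \cdots + \frac{1}{2} A_{n-1} (b^2 - a^2) + A_n (b - a).$$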
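
The polynomial rule translates directly into a short routine. The sketch below is illustrative (the function name and the coefficient ordering, highest power first as in $A_0, A_1, \ldots, A_n$, are choices made here); it applies formula (1) term by term and uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def integrate_polynomial(coeffs, a, b):
    """Integral over [a, b] of A_0 x^n + A_1 x^(n-1) + ... + A_n.

    coeffs = [A_0, A_1, ..., A_n]; each term is integrated with formula (1)
    and the results are combined by rules (7) and (8).
    """
    n = len(coeffs) - 1
    a, b = Fraction(a), Fraction(b)
    total = Fraction(0)
    for k, coeff in enumerate(coeffs):
        power = n - k  # exponent of x belonging to this coefficient
        total += Fraction(coeff) * (b ** (power + 1) - a ** (power + 1)) / (power + 1)
    return total


# 3x^2 + 2x + 1 over [0, 1]: (3/3)(1) + (2/2)(1) + 1 = 3
print(integrate_polynomial([3, 2, 1], 0, 1))
```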
c. Estimating Integrals 


Another obvious observation concerning integrals is basic. Consider 
for a < b a function f(x) which is positive or zero at each point of the 
interval [a, b]. Then 

(9)   $$\int_a^b f(x)\,dx \ge 0.$$

This follows immediately if we write the integral as limit of a sum and 
notice that the sums contain only nonnegative terms. 

More generally, if we have two functions f and g with the property 
that $f(x) \ge g(x)$ for all x in the interval [a, b], then 

(10)   $$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx.$$

For we have 

$$\int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b \left[f(x) - g(x)\right] dx \ge 0,$$

since f(x) − g(x) is never negative. 

We apply this result to a function f(x) which is continuous in the 
interval [a, b]. Let M be the greatest value and m the least value of f 
in that interval. Since 

$$m \le f(x) \le M$$

for all x in [a, b], we have 

$$\int_a^b m\,dx \le \int_a^b f(x)\,dx \le \int_a^b M\,dx.$$

Recalling that for any constant C 

$$\int_a^b C\,dx = C\int_a^b 1\,dx = C(b - a),$$

we obtain the inequality 

(11)   $$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a),$$

which gives simple upper and lower bounds for the definite integral of 
any continuous function. 

Again this estimate is intuitively obvious. If we think of the integral 
interpreted as an area, the quantities M(b − a) and m(b − a) represent 
areas of a circumscribed and an inscribed rectangle on the common 
base of length b − a (see Fig. 2.15). 


d. The Mean Value Theorem for Integrals 


Integral as a Mean Value 


A slightly different interpretation of our inequalities, in terms of the 
average of the function f in an interval [a, b], is significant. For a 
finite number of quantities $f_1, f_2, \ldots, f_n$ the average or arithmetic mean 
is the number 

$$\frac{f_1 + f_2 + \cdots + f_n}{n}.$$


Figure 2.15 


Figure 2.16 The mean value μ of a function. 


If we want to assign a meaning to the average value of the infinitely 
many quantities f(x) corresponding to arbitrary x in the interval 
[a, b], it is natural to pick out first a finite number n of values of f, 
say $f(x_1), f(x_2), \ldots, f(x_n)$, to form their average 

$$\frac{f(x_1) + \cdots + f(x_n)}{n}$$

and then to take the limit as n increases beyond all bounds. The value 
of this limit, if it exists at all, will depend very much on how the points 
$x_i$ are spaced in the interval [a, b]. A definite value for the average of f 
is attained if we take for the $x_i$ the points obtained when we divide the 
interval [a, b] into n equal parts of length $\Delta x_i = (b - a)/n$. We have 
then 

$$\frac{f(x_1) + \cdots + f(x_n)}{n} = \frac{1}{b - a} \sum_{i=1}^{n} f(x_i)\,\Delta x_i,$$

and it is clear that in the limit for $n \to \infty$ the nth averages converge 
towards the value 

$$\mu = \frac{1}{b - a} \int_a^b f(x)\,dx = \frac{\int_a^b f(x)\,dx}{\int_a^b dx}.$$

We shall call μ the “arithmetic average” or the mean value of f in the 
interval [a, b]. Our inequalities then simply state that the mean value 
of a continuous function cannot be larger than the greatest value or 
less than the least value of the function (Fig. 2.16). 

Since the function f(x) is continuous in the interval [a, b], there 
must be points in the interval where f has the value M or the value m. 
By the intermediate value theorem for continuous functions there must 
then also be a point ξ in the interval where f actually assumes the 
intermediate value μ. We have proved then: 


MEAN VALUE THEOREM. For a continuous function f(x) in the 
interval [a, b] there exists a value ξ in the interval such that 

(12)   $$\int_a^b f(x)\,dx = f(\xi)(b - a).$$


This is the simple but very important mean value theorem of integral 
calculus. In words, it states that the mean value of a continuous 
function in an interval belongs to the range of the function. 

The theorem asserts only the existence of at least one ξ in the interval 
for which f(ξ) is equal to the average value of f but gives no further 
information about the location of ξ. 
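
Although the theorem gives no information about the location of ξ, for a concrete function one can search for it. The sketch below is illustrative only: it approximates the mean value by midpoint sums and then locates ξ by bisection, under the additional assumption (not required by the theorem) that f is monotone on [a, b], so that f(ξ) = μ has exactly one solution there.

```python
import math


def mean_value(f, a, b, n=10000):
    """Approximate mu = (1/(b - a)) * integral of f over [a, b] by midpoint sums."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)


def find_xi(f, a, b, tol=1e-10):
    """Bisection for f(xi) = mu; assumes f is monotone on [a, b]."""
    mu = mean_value(f, a, b)
    lo, hi = a, b
    increasing = f(b) >= f(a)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Move toward the side on which f can still reach the value mu.
        if (f(mid) < mu) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# f(x) = x^2 on [0, 3]: mu = 3, so xi should be close to sqrt(3).
print(mean_value(lambda x: x * x, 0.0, 3.0), find_xi(lambda x: x * x, 0.0, 3.0), math.sqrt(3.0))
```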
Note that the formula expressing the mean value theorem stays 
valid if the limits a and b are interchanged; hence the mean value 
theorem is correct also when a > b. 


The Generalized Mean Value Theorem. Instead of the simple arithmetic 
average we often have to consider “weighted averages” of n quantities 
$f_1, \ldots, f_n$ given by 

$$\mu = \frac{p_1 f_1 + p_2 f_2 + \cdots + p_n f_n}{p_1 + p_2 + \cdots + p_n},$$

where the “weight factors” $p_i$ are any positive quantities. If, for example, 
$p_1, p_2, \ldots, p_n$ are actually the weights of particles located respectively at the 
points $f_1, f_2, \ldots, f_n$ of the x-axis, then μ will represent the location of 
the center of gravity. If all weights $p_i$ are equal, the quantity μ is just the 
arithmetic average defined above. 

For a function f(x) we can form analogously the weighted average 

(13)   $$\mu = \frac{\int_a^b f(x)\,p(x)\,dx}{\int_a^b p(x)\,dx}$$

over the interval [a, b] where p(x), the weight function, is any positive function 
in the interval. The assumption that p is positive guarantees that the 
denominator does not vanish. 


The weighted average μ also lies between the largest value M and the smallest 
value m of the function f in the interval. 


For multiplying the inequality 

$$m \le f(x) \le M$$

by the positive number p(x), we find that 

$$m\,p(x) \le f(x)\,p(x) \le M\,p(x).$$

Integration then yields 

$$m \int_a^b p(x)\,dx \le \int_a^b f(x)\,p(x)\,dx \le M \int_a^b p(x)\,dx.$$

Dividing by the positive quantity $\int_a^b p(x)\,dx$, we indeed obtain the result 

$$m \le \mu \le M.$$

If here f(x) is continuous, we conclude from the intermediate value 
theorem (p. 44) that μ = f(ξ), where ξ is a suitable value in the interval 
$a \le \xi \le b$. This leads to the following generalized mean value theorem of 
integral calculus: 

If f(x) and p(x) are continuous in the interval [a, b] and moreover p(x) is 
positive in that interval, then there exists a value ξ in the interval such that 

(14)   $$\int_a^b f(x)\,p(x)\,dx = f(\xi) \int_a^b p(x)\,dx.$$

The special case p(x) = 1 leads to our earlier mean value theorem. 
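
The weighted average (13) can be approximated in the same spirit as the ordinary mean. This is a minimal sketch with an illustrative function name; both integrals are replaced by midpoint sums over the same equal cells, and p is assumed positive so that the denominator does not vanish.

```python
def weighted_mean(f, p, a, b, n=1000):
    """Approximate (integral of f*p) / (integral of p) over [a, b] by midpoint sums."""
    h = (b - a) / n
    mids = [a + (i + 0.5) * h for i in range(n)]
    numerator = sum(f(x) * p(x) for x in mids) * h
    denominator = sum(p(x) for x in mids) * h   # positive because p > 0
    return numerator / denominator


# Center of gravity of a rod on [0, 1] whose density grows like p(x) = x:
# (integral of x*x) / (integral of x) = (1/3) / (1/2) = 2/3.
print(weighted_mean(lambda x: x, lambda x: x, 0.0, 1.0))
# With p(x) = 1 the ordinary mean value 1/2 is recovered.
print(weighted_mean(lambda x: x, lambda x: 1.0, 0.0, 1.0))
```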


2.4 The Integral as a Function of 
the Upper Limit (Indefinite Integral) 


Definition and Basic Formula 


The value of the integral of a function f(x) depends on the limits of 
integration a and b: The integral is a function of the two limits a and b. 
In order to study this dependence on the limits more closely we imagine 
the lower limit to be a fixed number, say α, denote the variable of 
integration no longer by x but by u (see p. 126), and denote the upper 
limit by x instead of by b in order to indicate that we shall consider the 
upper limit as the variable and that we wish to investigate the value of 
the integral as a function of this upper limit. Accordingly, we write 

$$\phi(x) = \int_\alpha^x f(u)\,du.$$

We call the function φ(x) an indefinite integral of the function f(x). 
When we speak of an and not of the indefinite integral, we suggest that 
instead of the lower limit α any other could be chosen, in which case we 
should ordinarily obtain a different value for the integral. 
Geometrically, the indefinite integral φ(x) is given by the area (shown by shading 
in Fig. 2.17) under the curve y = f(u) and bounded by the u-axis, 
the ordinate u = α and the variable ordinate u = x, the sign being 
determined by the rules discussed earlier (p. 126). 

Any particular definite integral is found from the indefinite integral 
φ(x). Indeed, by our basic rules for integrals, 

$$\int_a^b f(u)\,du = \int_a^\alpha f(u)\,du + \int_\alpha^b f(u)\,du
   = -\int_\alpha^a f(u)\,du + \int_\alpha^b f(u)\,du = \phi(b) - \phi(a).$$

In particular, we can express any other indefinite integral with a lower 
limit α′ in terms of φ(x): 

$$\int_{\alpha'}^x f(u)\,du = \phi(x) - \phi(\alpha').$$


Figure 2.17 The indefinite integral as an area. 


As we see, any indefinite integral differs from the special indefinite 
integral φ(x) only by a constant. 
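
The statement that two indefinite integrals differ only by a constant can be illustrated numerically. In the sketch below the function `phi` (an illustrative name) approximates the integral from a chosen lower limit to x by midpoint sums with a signed step, so x may lie on either side of the lower limit.

```python
def phi(f, alpha, x, n=2000):
    """Approximate the indefinite integral of f with lower limit alpha at x."""
    h = (x - alpha) / n   # signed step: x < alpha is allowed
    return h * sum(f(alpha + (i + 0.5) * h) for i in range(n))


def f(u):
    return 3.0 * u * u


# Lower limits 0 and 1: the difference phi_0(x) - phi_1(x) should not depend
# on x and should equal phi_0(1), the integral from 0 to 1.
for x in (-1.0, 0.5, 2.0):
    print(x, phi(f, 0.0, x) - phi(f, 1.0, x), phi(f, 0.0, 1.0))
```

The same helper reproduces the rule that a definite integral from a to b equals φ(b) − φ(a) for any fixed lower limit.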


Continuity of the Indefinite Integral 


If the function $f(x)$ is continuous in the interval $[a, b]$ and $\alpha$ is a
point of that interval, then the indefinite integral

$$\phi(x) = \int_\alpha^x f(u)\,du$$


represents a function of x which is again defined in the same interval.
As is easily seen: The indefinite integral $\phi(x)$ of a continuous function
$f(x)$ is likewise continuous. For if x and y are any two values in the
interval we have by the mean value theorem that

(15)  $$\phi(y) - \phi(x) = \int_x^y f(u)\,du = f(\xi)(y - x),$$

where $\xi$ is some value in the interval with end points x and y. From the
continuity of f we have then

$$\lim_{y \to x} \phi(y) = \lim_{y \to x}\,[\phi(x) + f(\xi)(y - x)] = \phi(x) + f(x)\cdot 0 = \phi(x),$$

which shows that $\phi$ is continuous. More specifically, in any closed
interval we have $|\phi(y) - \phi(x)| \le M\,|y - x|$, where M is the maximum
of $|f|$ in the interval, so that $\phi$ is even Lipschitz-continuous.
Formula (15) for $\phi(y) - \phi(x)$ shows that $\phi(x)$ is an increasing
function of x in case f is positive throughout the interval, namely, for y > x

$$\phi(y) = \phi(x) + f(\xi)(y - x) > \phi(x).$$
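The properties of the indefinite integral stated above can be checked numerically. The following is only a sketch: the midpoint rule, the sample integrand cos u, and the names `indefinite_integral`, `phi` and `phi_other` are assumptions chosen for illustration, not part of the text.

```python
import math

def indefinite_integral(f, alpha, x, n=10_000):
    """Approximate phi(x) = integral of f(u) du from alpha to x (midpoint rule)."""
    h = (x - alpha) / n
    return h * sum(f(alpha + (i + 0.5) * h) for i in range(n))

f = math.cos                       # sample integrand; max |f| = 1 on any interval
phi = lambda x: indefinite_integral(f, 0.0, x)        # lower limit alpha = 0
phi_other = lambda x: indefinite_integral(f, 1.0, x)  # lower limit alpha' = 1

# A definite integral obtained from the indefinite one: phi(b) - phi(a).
a, b = 0.5, 2.0
definite = indefinite_integral(f, a, b)
difference = phi(b) - phi(a)

# Two indefinite integrals differ only by a constant.
shift_at_2 = phi(2.0) - phi_other(2.0)
shift_at_3 = phi(3.0) - phi_other(3.0)

# Lipschitz estimate |phi(y) - phi(x)| <= M |y - x| with M = 1.
lipschitz_ok = abs(phi(2.3) - phi(2.0)) <= 1.0 * abs(2.3 - 2.0) + 1e-9
```

Since the sums are approximations, agreement is expected only up to a small discretization error.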


Forming the indefinite integral of a function is an important way of 
generating new types of functions. In Section 2.5 we shall apply this 
method to introduce the logarithm function. This will also give us a 
first glimpse of the fact that general theorems of mathematical analysis 
lead to the most remarkable specific formulas. 


As we shall see in Section 3.14a (p. 298), the definition of new 
functions by means of integrals of already defined functions is a 
satisfactory procedure if we wish to put definitions (for example, of the 
trigonometric functions) on a purely analytical basis instead of relying 
on intuitive geometrical explanations. 


2.5 Logarithm Defined by an Integral 


a. Definition of the Logarithm Function 
In Section 2.2 we had succeeded in expressing $\int_a^b x^\alpha\,dx$ for any rational
$\alpha \ne -1$ in terms of powers of a and b. For $\alpha = -1$ we were only
able to represent the integral as limit of a sequence

$$\int_a^b \frac{du}{u} = \lim_{n \to \infty} n\left(\sqrt[n]{b/a} - 1\right).$$

Independently of the discussions of Section 2.2 we now introduce the
function represented by the indefinite integral¹

$$\int_1^x \frac{1}{u}\,du$$


or, geometrically, by the area under a hyperbola as indicated in Fig.
2.18. We call it the logarithm of x, or more accurately the natural
logarithm of x, and write

(16)  $$\log x = \int_1^x \frac{1}{u}\,du.$$

Since $y = 1/u$ is a continuous and positive function for all u > 0,
the function log x is defined for all x > 0, is moreover continuous, and
also is monotonically increasing. The choice of 1 as the lower limit in


1 In this section we again freely use the fact that the integral of a continuous function 
(here the function 1/u) exists; the general proof is given in the Supplement. 


Figure 2.18 Log x represented by an area. 


the indefinite integral for log x is a matter of convenience. It implies that 
(17) log 1 = 0, 
and that log x is positive for x > 1 and negative for x between zero and
1 (Fig. 2.19). Any definite integral of 1/u between positive limits a
and b can be expressed in terms of logarithms by the formula (see p. 143)

(18)  $$\int_a^b \frac{1}{u}\,du = \log b - \log a.$$


Figure 2.19 The natural logarithm. 
Geometrically, this integral represents the area under the hyperbola
y = 1/x between the ordinates x = a and x = b.
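The definition of log x as an area lends itself to a direct numerical check. The sketch below is illustrative only: the midpoint rule and the helper names `log_by_area` and `log_by_roots` are assumptions, and `math.log` would serve as an independent reference value.

```python
import math

def log_by_area(x, n=100_000):
    """Midpoint-rule approximation of the integral of 1/u from 1 to x (x > 0)."""
    h = (x - 1.0) / n
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(n))

def log_by_roots(x, n=10**6):
    """One term n * (x**(1/n) - 1) of the sequence recalled from Section 2.2."""
    return n * (x ** (1.0 / n) - 1.0)

area_2 = log_by_area(2.0)          # positive, since 2 > 1
area_half = log_by_area(0.5)       # negative, since 0 < 1/2 < 1
# Formula (18) with a = 2, b = 3: the area between the ordinates 2 and 3.
between = log_by_area(3.0) - log_by_area(2.0)
```

Comparing `area_2` with `math.log(2.0)` and with `log_by_roots(2.0)` shows the two descriptions of the logarithm agreeing to several decimal places.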

b. The Addition Theorem for Logarithms 


The fundamental property which justifies the traditional name for 
log x is expressed by the 


ADDITION THEOREM. For any positive x and y 
(19) log (xy) = log x + log y. 
PROOF. We write the addition theorem in the form

$$\log(xy) - \log y = \log x$$

or

$$\int_y^{xy} \frac{1}{v}\,dv = \int_1^x \frac{1}{u}\,du,$$

where we have deliberately chosen different letters for the variables of
integration in the two integrals. The equality of the two integrals will
follow from the fact that the approximating sums have the same value
for suitable choices of subdivisions and of intermediate points. Assume
at first x > 1. Then

$$\int_1^x \frac{1}{u}\,du = \lim_{n \to \infty} \sum_{i=1}^n \frac{1}{\xi_i}\,\Delta u_i,$$

where $u_0 = 1, u_1, u_2, \ldots, u_n = x$ represent the points arising in a
subdivision of the interval [1, x] and $\xi_i$ lies in the ith cell. Putting
$v_i = y u_i$, $\eta_i = y \xi_i$ we see that the points $v_0, v_1, \ldots, v_n$ correspond
to a subdivision of the interval [y, xy] with intermediate points $\eta_i$.
Obviously,

$$\Delta v_i = y\,\Delta u_i,$$

so that

$$\sum_{i=1}^n \frac{1}{\eta_i}\,\Delta v_i = \sum_{i=1}^n \frac{1}{\xi_i}\,\Delta u_i.$$

For n tending to infinity we obtain the desired identity between integrals
for the case x > 1.

For x = 1 the addition theorem holds trivially, since log 1 = 0. 
To prove the theorem also for the case 0 < x < 1, we observe that then
1/x > 1, and hence

$$\begin{aligned}
\log x + \log y &= \log x + \log\left(\frac{1}{x}\,xy\right) \\
&= \log x + \log\frac{1}{x} + \log(xy) \\
&= \log\frac{1}{x} + \log x + \log(xy) \\
&= \log\left(\frac{1}{x}\,x\right) + \log(xy) \\
&= \log 1 + \log(xy) = \log(xy).
\end{aligned}$$


This completes the proof of the addition theorem. 
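The key step of the proof, that scaling a subdivision of [1, x] by y leaves the approximating sum unchanged, can be observed directly. This is a sketch with assumed sample values x = 3, y = 5 and midpoint tags; the function name `riemann_sum_reciprocal` is hypothetical.

```python
import math

def riemann_sum_reciprocal(points, tags):
    """Sum of (1/tag_i) * (points[i+1] - points[i]) over a tagged subdivision."""
    return sum((points[i + 1] - points[i]) / tags[i] for i in range(len(tags)))

x, y, n = 3.0, 5.0, 1000
u = [1.0 + (x - 1.0) * i / n for i in range(n + 1)]   # subdivision of [1, x]
xi = [(u[i] + u[i + 1]) / 2 for i in range(n)]         # one point in each cell
v = [y * ui for ui in u]                               # subdivision of [y, xy]
eta = [y * t for t in xi]                              # scaled intermediate points

sum_u = riemann_sum_reciprocal(u, xi)    # approximates the integral over [1, x]
sum_v = riemann_sum_reciprocal(v, eta)   # approximates the integral over [y, xy]
exact = math.log(x)                      # library value, for comparison only
```

The two sums agree up to rounding for every n, not merely in the limit, which is exactly what the argument uses.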




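A proof of the addition theorem can also be based on formula (3)
(p. 134), according to which

$$\log x = \lim_{n \to \infty} n\left(\sqrt[n]{x} - 1\right).$$

Then

$$\begin{aligned}
\log(xy) &= \lim_{n \to \infty} n\left(\sqrt[n]{xy} - 1\right) \\
&= \lim_{n \to \infty} \left[n\left(\sqrt[n]{x} - 1\right)\sqrt[n]{y} + n\left(\sqrt[n]{y} - 1\right)\right] \\
&= \left[\lim_{n \to \infty} n\left(\sqrt[n]{x} - 1\right)\right]\left(\lim_{n \to \infty} \sqrt[n]{y}\right) + \lim_{n \to \infty} n\left(\sqrt[n]{y} - 1\right) \\
&= \log x + \log y,
\end{aligned}$$

since $\lim_{n \to \infty} \sqrt[n]{y} = 1$ (see p. 64).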


Applying the addition theorem to the special case y = 1/x leads to

$$\log 1 = \log x + \log\frac{1}{x}$$

or

(20)  $$\log\frac{1}{x} = -\log x.$$

More generally then

(21)  $$\log\frac{y}{x} = \log y + \log\frac{1}{x} = \log y - \log x.$$


Repeated application of the addition theorem to a product of n
factors yields

$$\log(x_1 x_2 \cdots x_n) = \log x_1 + \log x_2 + \cdots + \log x_n.$$
In particular, we find that for any positive integer n

(22)  $$\log(x^n) = n \log x.$$

This identity also holds for n = 0, since $x^0 = 1$, and can be extended
to negative integers n by observing that

$$\log(x^n) = \log\left(\frac{1}{x^{-n}}\right) = -\log(x^{-n}) = -(-n)\log x = n \log x.$$


For any rational $\alpha = m/n$ and any positive a we can form
$a^\alpha = a^{m/n} = x$. We have then

$$\log x = \frac{1}{n}\log x^n = \frac{1}{n}\log a^m = \frac{m}{n}\log a = \alpha \log a.$$

Thus the identity

(23)  $$\log(a^\alpha) = \alpha \log a$$

holds for any positive real a and any rational $\alpha$.


2.6 Exponential Function and Powers 
a. The Logarithm of the Number e 


The constant e obtained on p. 79 as the limit of $(1 + 1/n)^n$ plays a
distinguished role for the function log x. Indeed, the number e is
characterized by the equation¹

$$\log e = 1.$$


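For the proof we observe that the continuity of the function log x
implies

$$\log e = \log\left[\lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^n\right]
= \lim_{n \to \infty} \log\left(1 + \frac{1}{n}\right)^n
= \lim_{n \to \infty} n \log\left(1 + \frac{1}{n}\right).$$

Now by the mean value theorem of integral calculus

$$\log\left(1 + \frac{1}{n}\right) = \int_1^{1+1/n} \frac{1}{u}\,du = \frac{1}{\xi}\cdot\frac{1}{n},$$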


1 This means geometrically that the area bounded by the hyperbola y = 1/x and the
lines y = 0, x = 1, and x = e has the value one (see Fig. 2.18).
where $\xi$ is some number between 1 and 1 + 1/n which depends on the
choice of n. Obviously, $\lim_{n \to \infty} \xi = 1$ so that

(24)  $$\log e = \lim_{n \to \infty} \frac{1}{\xi} = 1.$$


b. The Inverse Function of the Logarithm. The Exponential Function 
From the relation log e = 1 it follows that for any rational $\alpha$

$$\log(e^\alpha) = \alpha \log e = \alpha.$$

This shows that every rational number $\alpha$ occurs as a value of log x
for some positive x. Since log x is continuous, it assumes then any value
intermediate between two rational values; this means all real values.
It follows that for x varying over all positive values the values of
y = log x range over all numbers y. Since log x is monotonically
increasing, there exists for any real y exactly one positive x such that
log x = y. The solution x of the equation y = log x is given by the
inverse function of the logarithm which we shall denote by x = E(y).
We know then that E(y) (Fig. 2.20) is defined and positive for all y,
and again continuous and increasing (see p. 45).




Figure 2.20 The exponential function. 
Since the equations y = log x and x = E(y) stand for the same relation
between x and y, we can write the equation $\alpha = \log(e^\alpha)$, which is valid
for rational $\alpha$, also in the form

$$E(\alpha) = e^\alpha.$$

We see: for any rational $\alpha$ the value of $E(\alpha)$ is the $\alpha$th power of the
number e. For rational $\alpha = m/n$ the power $e^\alpha$ is defined directly as
$\sqrt[n]{e^m}$. For irrational $\alpha$ the expression $e^\alpha$ is defined most naturally by
representing $\alpha$ as the limit of a sequence of rational numbers $\alpha_n$ and
putting $e^\alpha = \lim_{n \to \infty} e^{\alpha_n}$. Since $e^{\alpha_n} = E(\alpha_n)$ and since the function E(y)
depends continuously on y, we can be sure that the limit of the $e^{\alpha_n}$
exists and that it has the value $E(\alpha)$ independently of the special
sequence used to approximate $\alpha$. This proves that the equation
$E(\alpha) = e^\alpha$ holds for irrational $\alpha$ as well. For all real $\alpha$ we can now
write $e^\alpha$ instead of $E(\alpha)$. We call $e^x$ the exponential function. This
function is defined and continuous for all x, is increasing, and positive
everywhere.

Since the equations y = log x and $x = e^y$ are two ways of expressing
the same relation between the numbers x and y, we see that log x, the
“natural logarithm” of x (as defined here by an integral) stands for
the logarithm to the base e, as that term would be used in elementary
mathematics; that is, log x is the exponent of that power of e which is
equal to x or

(25)  $$e^{\log x} = x.$$

We can write¹ $\log x = \log_e x$.
Similarly, $x = e^y$ is that number whose logarithm is y, or

(26)  $$\log e^y = y.$$


From the point of view of calculus it is really easier to introduce
natural logarithms first as integrals of the simple function y = 1/x, as
we did here, and to define powers of e by taking the inverse of the
logarithm function. In this way the continuity and monotonicity of the
functions log x and $e^x$ arise just as consequences of general theorems
and require no special arguments.


1 The reader may feel that the name ‘‘natural logarithm” should have been reserved 
rather for logarithms to the base 10. However, historically the first table of log- 
arithms published by Napier in 1614 essentially gave logarithms to the base e. 
Logarithms to the base 10 were introduced only subsequently by Briggs because of 
their obvious computational advantages. 
c. The Exponential Function as Limit of Powers


Originally we obtained the number e as the limit

$$e = \lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^n.$$

A more general formula represents $e^x$ for any x as a limit

(27)  $$e^x = \lim_{n \to \infty}\left(1 + \frac{x}{n}\right)^n.$$


For the proof it is sufficient to show that the sequence

$$s_n = n\log\left(1 + \frac{x}{n}\right)$$

has the limit x. For then the sequence of values

$$e^{s_n} = \left(1 + \frac{x}{n}\right)^n$$

must tend to $e^x$ since the exponential function is continuous. Now

$$s_n = n\log\left(1 + \frac{x}{n}\right) = n\int_1^{1+x/n} \frac{d\xi}{\xi}.$$

By the mean value theorem of integral calculus we have

$$s_n = n\cdot\frac{1}{\xi_n}\cdot\frac{x}{n} = \frac{x}{\xi_n},$$

where $\xi_n$ is some value between one and 1 + x/n. Since obviously
$\xi_n$ tends to one for n tending to $\infty$, we have indeed $\lim_{n \to \infty} s_n = x$.
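Formula (27) and the auxiliary sequence $s_n$ can be watched converging numerically. The sketch below assumes the sample value x = 1.5 and uses the library functions `math.exp` and `math.log1p` as reference implementations; the helper names `power_approx` and `s` are hypothetical.

```python
import math

def power_approx(x, n):
    """(1 + x/n)**n, the n-th term of the sequence in formula (27)."""
    return (1.0 + x / n) ** n

def s(x, n):
    """s_n = n * log(1 + x/n), which should tend to x as n grows."""
    return n * math.log1p(x / n)

x = 1.5
# Distance from e**x for increasing n; the gaps should shrink steadily.
errors = [abs(power_approx(x, n) - math.exp(x)) for n in (10, 100, 1000, 10_000)]
s_large = s(x, 10**6)
```

The convergence is slow (the error shrinks roughly like 1/n), so the limit is of theoretical rather than computational interest.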


d. Definition of Arbitrary Powers of Positive Numbers 


Arbitrary powers of any positive numbers can now be expressed in
terms of the exponential and logarithmic functions.¹
We found for rational $\alpha$ and any positive x that the relation

$$\log(x^\alpha) = \alpha \log x$$

holds. We write this equation in the form

$$x^\alpha = e^{\alpha \log x}.$$


1 This obviates the more clumsy “‘elementary”’ definition and justification of these 
processes by passage to the limit from rational exponents indicated on p. 86. 
For irrational $\alpha$ we again represent $\alpha$ as limit of a sequence of rational
numbers $\alpha_n$ and define

$$x^\alpha = \lim_{n \to \infty} x^{\alpha_n} = \lim_{n \to \infty} e^{\alpha_n \log x}.$$

The continuity of the exponential function implies again that the
limit exists and that it has the value $e^{\alpha \log x}$, since

$$e^{\alpha \log x} = e^{\lim (\alpha_n \log x)} = \lim_{n \to \infty} e^{\alpha_n \log x}.$$

Hence the equation

(28)  $$x^\alpha = e^{\alpha \log x}$$

holds quite generally for any $\alpha$ and any positive x. Putting $\log x = \beta$
or, what is the same, $x = e^\beta$ we infer

(29)  $$(e^\beta)^\alpha = e^{\alpha\beta},$$

and more generally then for any positive x

$$(x^\alpha)^\beta = (e^{\alpha \log x})^\beta = e^{\alpha\beta \log x} = x^{\alpha\beta}.$$


Another rule for working with powers which is easily established in
complete generality, is the multiplication law

$$x^\alpha x^\beta = x^{\alpha+\beta},$$

where x is a positive number and $\alpha$ and $\beta$ are arbitrary. It is sufficient
to prove the corresponding formula obtained by taking the logarithms
of both sides:

$$\log(x^\alpha x^\beta) = \log(x^{\alpha+\beta}).$$

Now by the rules (19), (26), and (28) already established it follows that

$$\begin{aligned}
\log(x^\alpha x^\beta) &= \log x^\alpha + \log x^\beta = \log(e^{\alpha \log x}) + \log(e^{\beta \log x}) \\
&= \alpha \log x + \beta \log x = (\alpha + \beta)\log x \\
&= \log(e^{(\alpha+\beta)\log x}) = \log(x^{\alpha+\beta}).
\end{aligned}$$
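Definition (28) translates directly into a small routine, and the two power laws can then be tested for irrational exponents. This is a sketch only: `power` is a hypothetical helper built on the library's `math.exp` and `math.log`, and the sample values of x, alpha and beta are assumptions.

```python
import math

def power(x, alpha):
    """x**alpha for positive x, computed as exp(alpha * log x) as in formula (28)."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.exp(alpha * math.log(x))

x, alpha, beta = 2.5, math.sqrt(2.0), math.pi   # irrational sample exponents

# Multiplication law: x^alpha * x^beta = x^(alpha + beta).
lhs_product = power(x, alpha) * power(x, beta)
rhs_product = power(x, alpha + beta)

# Power of a power: (x^alpha)^beta = x^(alpha * beta).
lhs_nested = power(power(x, alpha), beta)
rhs_nested = power(x, alpha * beta)
```

Both pairs agree to within floating-point rounding, as the derivation predicts.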


e. Logarithms to Any Base 


It is easy to express logarithms to a base other than e in terms of
natural logarithms. If for a positive number $a \ne 1$ the equation $x = a^y$ is
satisfied, we write

$$y = \log_a x.$$

Now $a^y = e^{y \log a}$ so that $x = e^{y \log a}$ or $y \log a = \log x$. It follows that

(30)  $$\log_a x = \frac{\log x}{\log a},$$
where log x is the natural logarithm to the base e. In particular, the
common logarithms to the base 10 are given by

$$\log_{10} x = \frac{\log x}{\log 10}.$$

Since logarithms to any base a are proportional to natural logarithms,
they satisfy the same addition theorem:

$$\log_a x + \log_a y = \log_a (xy).$$
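Formula (30) amounts to a one-line conversion. The helper name `log_base` below is hypothetical, and the natural logarithm is taken from the standard library rather than from the integral definition.

```python
import math

def log_base(a, x):
    """Logarithm of x to the base a (a > 0, a != 1, x > 0), via formula (30)."""
    return math.log(x) / math.log(a)

common_log_1000 = log_base(10.0, 1000.0)            # close to 3
sum_of_logs = log_base(2.0, 8.0) + log_base(2.0, 4.0)
log_of_product = log_base(2.0, 32.0)                # addition theorem in base 2
```

Because of rounding in the division, results such as `common_log_1000` may differ from the exact integer in the last decimal place.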


2.7 The Integral of an Arbitrary Power of x 


In Section 2.2 we obtained the formula

$$\int_a^b u^\alpha\,du = \frac{b^{\alpha+1} - a^{\alpha+1}}{\alpha + 1}$$

for any rational $\alpha \ne -1$. (The case $\alpha = -1$ was seen to lead to the
logarithm.) To evaluate the integral when $\alpha$ is an irrational number, it
is sufficient to discuss the indefinite integral

$$\phi(x) = \int_1^x u^\alpha\,du$$

from which all definite integrals with positive limits a and b can be
obtained. Assume x > 1 (the case x < 1 can be handled in the same
fashion after interchanging the limits). We have then by (28)

$$u^\alpha = e^{\alpha \log u},$$

where $\log u \ge 0$ for u in the interval of integration. Let $\beta$ and $\gamma$ be any
two rational numbers different from $-1$ for which

$$\beta < \alpha < \gamma.$$

Then also

$$\beta \log u \le \alpha \log u \le \gamma \log u.$$

Since the exponential function is increasing, this implies

$$e^{\beta \log u} \le e^{\alpha \log u} \le e^{\gamma \log u},$$

that is,

$$u^\beta \le u^\alpha \le u^\gamma.$$

We have then

$$\int_1^x u^\beta\,du \le \phi(x) \le \int_1^x u^\gamma\,du.$$
The integrals of $u^\beta$ and $u^\gamma$ were evaluated before, leading to

$$\frac{1}{\beta + 1}\,(x^{\beta+1} - 1) \le \phi(x) \le \frac{1}{\gamma + 1}\,(x^{\gamma+1} - 1).$$

If we now let the rational numbers $\beta$ and $\gamma$ converge to $\alpha$, we obtain in
the limit

$$\phi(x) = \frac{1}{\alpha + 1}\,(x^{\alpha+1} - 1),$$

since $x^{\beta+1} = e^{(\beta+1)\log x}$ and $x^{\gamma+1} = e^{(\gamma+1)\log x}$ tend to
$e^{(\alpha+1)\log x} = x^{\alpha+1}$ because of the continuity of the exponential
function. The same result follows for x between zero and one. Thus
generally for positive a, b

$$\int_a^b u^\alpha\,du = \phi(b) - \phi(a) = \frac{1}{\alpha + 1}\,(b^{\alpha+1} - a^{\alpha+1}),$$

just as for rational $\alpha$.

When $\alpha$ is a positive integer, the formula remains valid even when the
limits a or b become zero or negative; it is easy to extend the formula
directly to those cases.
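The squeezing argument can be illustrated for the irrational exponent $\alpha = \sqrt{2}$. In this sketch the rational bounds 1.41 and 1.42, the upper limit x = 3, the midpoint rule, and the function names are all assumptions made for the example.

```python
import math

def integral_of_power(alpha, x, n=100_000):
    """Midpoint-rule approximation of the integral of u**alpha from 1 to x."""
    h = (x - 1.0) / n
    return h * sum((1.0 + (i + 0.5) * h) ** alpha for i in range(n))

def closed_form(alpha, x):
    """(x**(alpha + 1) - 1) / (alpha + 1), meaningful for alpha != -1."""
    return (x ** (alpha + 1.0) - 1.0) / (alpha + 1.0)

alpha, x = math.sqrt(2.0), 3.0
beta, gamma = 1.41, 1.42            # rational numbers with beta < alpha < gamma
lower = closed_form(beta, x)        # integral of u**beta from 1 to x
upper = closed_form(gamma, x)       # integral of u**gamma from 1 to x
numeric = integral_of_power(alpha, x)
```

The numerical value lies between the two rational bounds and agrees with `closed_form(alpha, x)` up to the discretization error.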


2.8 The Derivative 


The concept of the derivative, like that of the integral, has an 
immediate intuitive origin and is easy to grasp. Yet it opens the door 
to an enormous wealth of mathematical facts and insights; the student 
will only gradually become aware of the variety of significant appli- 
cations and of the power of the techniques which we shall develop in 
this book. 

The concept of derivative is first suggested by the intuitive notion of
the tangent to a smooth curve y = f(x) at a point P with the coordinates
x and y. This tangent is characterized by the angle $\alpha$ between its
direction and the positive x-axis. But how does one obtain this angle
from the analytical description of the function f(x)? The knowledge of
the values of x and y at the point P does not suffice to determine the
angle $\alpha$ since there are infinitely many different lines besides the tangent
passing through P. On the other hand, to determine $\alpha$ one does not
need to know the function f(x) in its total over-all behavior; the
knowledge of the function in an arbitrary neighborhood of the point P
must be sufficient to determine the direction $\alpha$, no matter how tiny a
neighborhood is chosen. This indicates that we should define the
direction of the tangent to a curve y = f(x) by a limiting process, as
we shall presently do.
The problem of calculating the direction of tangents, or of “differentiation,”
was impressed on mathematicians as early as the sixteenth
century by optimization problems, that is, questions of maxima and
minima arising in geometry, mechanics and optics. (See the discussion
in Section 3.6.)

Another problem of paramount importance which leads to differen- 
tiation is that of giving a precise mathematical meaning to the intuitive 
notion of velocity in an arbitrary nonuniform motion (see p. 162). 

We shall start with the problem of describing the tangent to a curve 
analytically by a limit process. 


a. The Derivative and the Tangent 


Geometric Definition. In conformity with naive intuition, we define 
the tangent to the given curve y = f(x) at one of its points P by means 




Figure 2.21 Secant and tangent. 


of the following geometrical limiting process (Fig. 2.21). We consider a
second point $P_1$ near P on the curve. Through the two points P, $P_1$
we draw a straight line, a secant of the curve. If now the point $P_1$ moves
along the curve towards the point P, then the secant is expected to
approach a limiting position which is independent of the side from
which $P_1$ tends to P. This limiting position of the secant is the tangent;
the statement that such a limiting position of the secant exists is
equivalent to the assumption that the curve has a definite tangent or a
definite direction at the point P. (We have used the word “assumption”
because we have actually made one. The hypothesis that the tangent
exists at every point is by no means true for all curves representing
simple functions. For example, any curve with a corner or vertex at a
point P does not have a uniquely determined direction there, such as the
curve defined by y = |x| at (0, 0). See the discussion on p. 166.)




Figure 2.22 


Since our curve is represented by means of a function y = f(x),
we must formulate the geometric limiting process analytically, with
reference to f(x). This analytical limit process is called differentiation
of f(x).

Consider the angle which a straight line makes with the x-axis as
the one through which the positive x-axis must be turned in the positive
direction or counterclockwise¹ in order to become for the first time
parallel to the line. (This would be an angle $\alpha$ in the interval $0 \le \alpha < \pi$.)
Let $\alpha_1$ be the angle which the secant $PP_1$ forms with the positive x-axis
(cf. Fig. 2.22) and $\alpha$ the angle which the tangent forms with the positive
x-axis. Then

$$\lim_{P_1 \to P} \alpha_1 = \alpha,$$

1 That is, in such a direction that a rotation of $\pi/2$ brings it into coincidence with the
positive y-axis.




where the meaning of the symbols is obvious. Let x, y and $x_1$, $y_1$ be the
coordinates of the points P and $P_1$ respectively. Then we immediately
have¹

$$\tan \alpha_1 = \frac{y_1 - y}{x_1 - x} = \frac{f(x_1) - f(x)}{x_1 - x};$$

thus our limiting process (disregarding the case $\alpha = \pi/2$ of a perpendicular
tangent) is represented by the equation

$$\lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{x_1 \to x} \tan \alpha_1 = \tan \alpha.$$


Notation. The expression

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{y_1 - y}{x_1 - x} = \frac{\Delta y}{\Delta x}$$

we call the difference quotient of the function y = f(x) where the symbols
$\Delta y$ and $\Delta x$ denote the differences of the function y = f(x) and of the
independent variable x. (Here, as on p. 124, the symbol $\Delta$ is an
abbreviation for difference, and is not a factor.) The trigonometric
tangent of $\alpha$, the “slope” of the curve,² is therefore equal to the limit
to which the difference quotient of our function tends when $x_1$ tends
to x.

We call this limit of the difference quotient the derivative³ of the
function y = f(x) at the point x. We shall generally use either the
notation of Lagrange, y' = f'(x), to denote the derivative, or, as
Leibnitz did, the symbol⁴ dy/dx or df(x)/dx or (d/dx) f(x). On p. 171
we shall discuss the meaning of Leibnitz’s notation in more detail;
here we point out: The notation f'(x) indicates the fact that the
derivative is itself a function of x since a value of f'(x) corresponds to
each value of x in the interval considered. This fact is sometimes
emphasized by the use of the terms derived function, derived curve. The
definition of the derivative appears in several different forms:

$$f'(x) = \lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$


1 In order that this equation may have a meaning, we must assume that both x and 
x, lie in the domain of f. In what follows, corresponding assumptions will often 
be made tacitly in the steps leading up to limiting processes. 

2 The word gradient or direction coefficient is used occasionally.

3 The term differential coefficient is also used in older textbooks.

4 Cauchy’s notation Df(x) and Newton’s notation $\dot{y}$ are also used.
where in the second expression $x_1$ is replaced by x + h, or in
Leibnitz’s notation,

$$\frac{dy}{dx} = \frac{df(x)}{dx} = \lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

If f is defined in a neighborhood of the point x, then the quotient
[f(x + h) - f(x)]/h is defined as a function of h for all values $h \ne 0$
for which |h| is sufficiently small to ensure that x + h is in the interval
under consideration. The definition of f'(x) as a limit requires that

$$\left|\frac{f(x + h) - f(x)}{h} - f'(x)\right|$$

is arbitrarily small for all $h \ne 0$ (positive or negative) for which |h| is
sufficiently small.


Analytic Calculation of Derivatives. The intuitive concept and the 
general analytic notion of derivative are simple and straightforward. 
Less obvious is the procedure of actually carrying out such limiting 
processes. 

It is impossible to find the derivative merely by putting $x_1 = x$ in
the expression for the difference quotient, for then the numerator and
denominator would both be equal to zero and we would be led to the
meaningless expression 0/0. Thus the passage to the limit in each case
depends on certain preliminary steps (transformation of the difference
quotient).

For example, for the function $f(x) = x^2$ we have

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^2 - x^2}{x_1 - x} = x_1 + x \quad \text{whenever } x_1 \ne x.$$

This function $x_1 + x$ does not have exactly the same domain as
$(x_1^2 - x^2)/(x_1 - x)$: The function $x_1 + x$ is defined at the one point
$x_1 = x$, where the quotient $(x_1^2 - x^2)/(x_1 - x)$ is undefined. For all
other values of $x_1$ the two functions are equal to one another; hence in
the passage to the limit, for which we specifically require that $x_1 \ne x$,
we obtain the same value for $\lim_{x_1 \to x} (x_1^2 - x^2)/(x_1 - x)$ as for
$\lim_{x_1 \to x} (x_1 + x)$.
However, since the function $x_1 + x$ is defined and continuous at the
point $x_1 = x$, we can do with it what we could not do with the quotient,
namely, pass to the limit by simply putting $x_1 = x$. For the derivative
we then obtain

$$f'(x) = \frac{d(x^2)}{dx} = 2x.$$




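As another example we differentiate, that is, calculate the derivative
of the function $y = \sqrt{x}$ for x > 0. We have for $x_1 \ne x$

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\sqrt{x_1} - \sqrt{x}}{x_1 - x}
= \frac{(\sqrt{x_1} - \sqrt{x})(\sqrt{x_1} + \sqrt{x})}{(x_1 - x)(\sqrt{x_1} + \sqrt{x})}
= \frac{x_1 - x}{(x_1 - x)(\sqrt{x_1} + \sqrt{x})} = \frac{1}{\sqrt{x_1} + \sqrt{x}}.$$

Hence (for x > 0)

$$\frac{d\sqrt{x}}{dx} = \lim_{x_1 \to x} \frac{1}{\sqrt{x_1} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.$$

For x = 0 we have a singularity: The derivative is infinite, since
$(\sqrt{x_1} - 0)/(x_1 - 0) = 1/\sqrt{x_1} \to \infty$ for $x_1 \to 0$.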
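The two worked examples can be followed numerically by evaluating the difference quotient for values of $x_1$ approaching x. This is a sketch; the helper name `difference_quotient`, the point x = 3 and the step sizes are assumptions for illustration.

```python
import math

def difference_quotient(f, x, x1):
    """(f(x1) - f(x)) / (x1 - x); meaningless (0/0) when x1 == x."""
    if x1 == x:
        raise ZeroDivisionError("difference quotient is undefined for x1 == x")
    return (f(x1) - f(x)) / (x1 - x)

square = lambda t: t * t
x = 3.0
steps = (0.1, 0.01, -0.001)        # x1 = x + h, approaching x from both sides
points = [x + h for h in steps]

# For f(x) = x**2 the quotient equals x1 + x exactly (up to rounding).
quotients_square = [difference_quotient(square, x, x1) for x1 in points]
# For f(x) = sqrt(x) the quotient equals 1 / (sqrt(x1) + sqrt(x)).
quotients_sqrt = [difference_quotient(math.sqrt, x, x1) for x1 in points]
```

As $x_1$ approaches 3 the first list approaches 2x = 6 and the second approaches $1/(2\sqrt{3})$, while the call with `x1 == x` is rejected, mirroring the remark about 0/0.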


Analytic Definition 


It is extremely significant that the process of differentiating a function 
has a definite analytic meaning quite apart from the geometric intuitive 
conception of the tangent. The analytic definition of the integral, freed 
from the geometric visualization of area, allowed us to base the notion 
of area on that of integral. In a similar spirit, independently of the 
geometrical representation of a function y = f(x) by means of a curve, 
we define the derivative of the function y = f(x) as the new function 
y' = f'(x) given by the limit of the difference quotient $\Delta y/\Delta x$ provided
that the limit exists.

Here the differences $\Delta y = y_1 - y = f(x_1) - f(x)$ and $\Delta x = x_1 - x$
are “corresponding changes” in the variables y and x. The ratio
$\Delta y/\Delta x$ can be called the “average rate of change” of y with respect to x
in the interval $(x, x + \Delta x)$. The limit f'(x) = dy/dx represents then the
“instantaneous rate of change” or simply the “rate of change” of y
with respect to x.

If this limit exists, we say that the function f(x) is differentiable.
We shall always assume that every function dealt with is differentiable
unless specific mention is made to the contrary.¹ We emphasize that
if the function f(x) is to be differentiable at the point x the limit as
$h \to 0$ of the quotient [f(x + h) - f(x)]/h must exist, where h can
have any value $\ne 0$ for which x + h belongs to the domain of f. If,
in particular, f is defined in a whole interval containing the point x
in its interior, then the limit must exist independently of the manner in


1 Examples in which this assumption is not satisfied will be given later (see p. 167). 
Such examples justify mentioning differentiability as an assumption if the context 
warrants it. 
which h tends to zero, whether it be through positive values or through
negative values, without restriction upon sign.

Having now an analytic definition for the derivative f'(x), we take
the direction angle $\alpha$¹ to the positive x-axis given by the equation
$\tan \alpha = f'(x)$ as the direction of the tangent to the curve at the point
(x, y). By thus basing the geometric definition on the analytic one we
avoid the difficulties which might arise from the vagueness of the
geometric visualization. In fact, we have now defined precisely what
we mean by a tangent to the graph of y = f(x) at a point (x, y), and
we have an analytic criterion for deciding whether or not a curve has a
tangent at a given point (x, y).


Monotone Functions 
Nevertheless, the visual interpretation of the derivative as the slope
of the tangent to the curve is a highly useful aid to understanding, even
in purely analytic discussions. A case in question is the following
statement based on geometric intuition:


The function f(x) is monotonically increasing when f'(x) > 0 and mono- 
tonically decreasing when f'(x) < 0. 




Figure 2.23 Tangents to graphs of increasing and decreasing functions. 


Indeed, if f'(x) is positive and the curve is traversed in the direction of
increasing x, then the tangent slants upwards, that is, toward increasing
y ($\alpha$ is an “acute angle”); therefore at the point in question the curve


1 The angle $\alpha$ is not determined quite uniquely but can be replaced by $\alpha + \pi$,
$\alpha + 2\pi$, etc., unless we specify as above that $0 \le \alpha < \pi$.

rises as x increases; if, on the other hand, f'(x) is negative, the tangent
slants downwards ($\alpha$ is an “obtuse angle”) and the curve falls as x
increases (see Fig. 2.23). Analytically this will be proved on p. 177.


b. The Derivative as a Velocity 


The need to replace the intuitive concept of velocity or speed by a 
precise definition leads once again to exactly the same limiting process 
we have already called differentiation. 

Consider the example of a point moving on a straight line, the 
directed y-axis, the position of the point being determined by a single 
coordinate y. This coordinate y is the distance, with its proper sign, of 
our moving point from a fixed initial point on the line. The motion is 
given if we know y as a function of the time t: y = f(t). If this function 
is a linear function f(t) = ct + b, we speak of a uniform motion with 
the velocity c, and for every pair of distinct values t and $t_1$ we can obtain
the velocity by dividing the distance traversed in a time interval by the
length of that time interval:

$$c = \frac{f(t_1) - f(t)}{t_1 - t}.$$

The velocity is therefore the difference quotient of the function ct + b,
and this difference quotient is independent of the particular pair of
instants which we fix upon. But what are we to understand by the
velocity of motion at an instant t if the motion is no longer uniform?
To answer this question we consider the difference quotient

$$\frac{f(t_1) - f(t)}{t_1 - t},$$

which we shall call the average velocity in the time interval between
$t_1$ and t. Now if this average velocity tends to a definite limit when we
let $t_1$ tend to t, we shall define this limit as the velocity at the time t.
In other words: the velocity, that is, the instantaneous rate of change of
distance with respect to time at the time t, is the derivative

$$f'(t) = \lim_{t_1 \to t} \frac{f(t_1) - f(t)}{t_1 - t}.$$

Newton emphasized the interpretation of derivatives¹ as velocity, and
wrote $\dot{y}$ or $\dot{f}(t)$ instead of $f'(t)$, a notation which we shall occasionally
use. Again, the differentiability of the function is a necessary assumption
if the notion of velocity is to have a meaning.


1 Called by him “‘fluxions.” 
A simple example is the motion of freely falling bodies. We start
from the experimentally established law that the distance traversed in
time t by a freely falling body starting from rest at t = 0 is proportional
to $t^2$; it is therefore represented by a function of the form

$$y = f(t) = at^2$$

with constant a. As on p. 159, the velocity then is given by the expression
$f'(t) = 2at$; thus: the velocity of a freely falling body increases in
proportion to the time.
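The passage from average to instantaneous velocity can be seen in numbers for the falling body. In this sketch the constant a = 4.9 (in metre and second units) and the instant t = 2 are assumed sample values, and the function names are hypothetical.

```python
def position(t, a=4.9):
    """Distance fallen after time t, y = a * t**2 (a = 4.9 is an assumed value)."""
    return a * t * t

def average_velocity(t, t1, a=4.9):
    """Average velocity over the time interval between t and t1 (t1 != t)."""
    return (position(t1, a) - position(t, a)) / (t1 - t)

t = 2.0
# Shrinking time intervals [t, t + dt]; the quotient equals a * (t1 + t).
averages = [average_velocity(t, t + dt) for dt in (1.0, 0.1, 0.001)]
instantaneous = 2 * 4.9 * t        # f'(t) = 2at
```

The averages decrease toward the instantaneous value 2at as the interval shrinks, the last differing from it only by about a times 0.001.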


c. Examples of Differentiation 


We now illustrate the technique of differentiation by a number of 
typical examples. 


Linear Functions 


For the function y = f(x) = c with constant c we see for all x,
f(x + h) - f(x) = c - c = 0, so that $\lim_{h \to 0} [f(x + h) - f(x)]/h = 0$;
that is, the derivative of a constant function is zero.

For a linear function y = f(x) = cx + b, we find

$$f'(x) = \lim_{h \to 0} \frac{c(x + h) + b - cx - b}{h} = \lim_{h \to 0} \frac{ch}{h} = c.$$

The derivative of a linear function is constant.

Powers of x

Next, we differentiate the power function

$$y = f(x) = x^\alpha,$$

at first assuming that $\alpha$ is a positive integer. Provided $x_1 \ne x$, we have

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^\alpha - x^\alpha}{x_1 - x}
= x_1^{\alpha-1} + x_1^{\alpha-2}x + \cdots + x^{\alpha-1},$$

where we divide directly or use the formula for the sum of a geometric
progression. This simple algebraic manipulation is the key to the
passage to the limit; for the last expression on the right-hand side of
the equation is a continuous function of $x_1$, in particular for $x_1 = x$,
and so we can carry out the passage to the limit $x_1 \to x$ for this expression
simply by replacing $x_1$ everywhere by x. Each term then takes the
value $x^{\alpha-1}$, and since the number of terms is exactly $\alpha$, we obtain

$$y' = f'(x) = \frac{d(x^\alpha)}{dx} = \alpha x^{\alpha-1}.$$
We arrive at the same result if $\alpha$ is a negative integer $-\beta$; we must,
however, assume that x is not zero. We then find

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\dfrac{1}{x_1^\beta} - \dfrac{1}{x^\beta}}{x_1 - x}
= -\frac{x_1^\beta - x^\beta}{x_1 - x}\cdot\frac{1}{x_1^\beta x^\beta}.$$

Once again we can carry out the passage to the limit simply by substituting
x for $x_1$. Then just as before we obtain for the limit

$$y' = -\beta\,\frac{x^{\beta-1}}{x^{2\beta}} = -\beta x^{-\beta-1}.$$

Hence for negative integral values $\alpha = -\beta$ the derivative is again
given by the formula

$$y' = \alpha x^{\alpha-1}.$$


Finally, we shall prove the same formula where x is positive and $\alpha$
any rational number. We suppose that $\alpha = p/q$, where p and q are
both integers and, moreover, positive. (If one of them were negative,
no essential changes in the proof would be needed; for $\alpha = 0$ the result
is already known, since $x^\alpha$ is then constant.) We now have

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^{p/q} - x^{p/q}}{x_1 - x}.$$

If we now put $x^{1/q} = \xi$ and $x_1^{1/q} = \xi_1$, we obtain

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\xi_1^p - \xi^p}{\xi_1^q - \xi^q}
= \frac{\xi_1^{p-1} + \xi_1^{p-2}\xi + \cdots + \xi^{p-1}}{\xi_1^{q-1} + \xi_1^{q-2}\xi + \cdots + \xi^{q-1}}.$$

After this last transformation we can immediately perform the passage
to the limit $x_1 \to x$ (or what amounts to the same thing, $\xi_1 \to \xi$), and
thus obtain for the limiting value the expression

$$y' = \frac{p\,\xi^{p-1}}{q\,\xi^{q-1}} = \frac{p}{q}\,\xi^{p-q} = \frac{p}{q}\,x^{(p-q)/q},$$

or finally,

$$y' = f'(x) = \alpha x^{\alpha-1},$$

which is formally the same result as before. We leave it for the reader
to prove for himself that the same differentiation formula holds also
for negative rational exponents.
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We shall come back (p. 186) to the differentiation of powers and 
prove the’ general validity of the preceding formula for arbitrary 
exponents a. 
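As an informal numerical check of the power rule (an illustration added here, not part of the derivation), the following Python sketch compares secant slopes of $x^{\alpha}$ with the value $\alpha x^{\alpha - 1}$ for a positive integer, a negative integer, and a rational exponent. The helper names `difference_quotient` and `power_rule` are hypothetical and chosen only for this example.

```python
def difference_quotient(f, x, x1):
    """Slope of the secant through (x, f(x)) and (x1, f(x1)); requires x1 != x."""
    return (f(x1) - f(x)) / (x1 - x)


def power_rule(alpha, x):
    """Derivative of x**alpha as given by the formula alpha * x**(alpha - 1)."""
    return alpha * x ** (alpha - 1)


x = 2.0
for alpha in (3, -2, 2 / 3):
    # Bind alpha as a default argument so each lambda keeps its own exponent.
    f = lambda t, a=alpha: t ** a
    approx = difference_quotient(f, x, x + 1e-6)
    print(alpha, approx, power_rule(alpha, x))
```

The two printed columns should agree to several digits; the remaining gap shrinks as $x_1$ is taken closer to $x$, until floating-point rounding dominates.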


Trigonometric Functions 


As a last example we consider the differentiation of the trigonometric
functions $\sin x$ and $\cos x$. We use the elementary trigonometric addition
formula to transform the difference quotient
\[
\frac{\sin(x + h) - \sin x}{h}
= \frac{\sin x \cos h + \cos x \sin h - \sin x}{h}
= \sin x \, \frac{\cos h - 1}{h} + \cos x \, \frac{\sin h}{h}.
\]
Recalling the relations of Section 1.8, pp. 84-85,
\[
\lim_{h \to 0} \frac{\sin h}{h} = 1, \qquad \lim_{h \to 0} \frac{\cos h - 1}{h} = 0,
\]
we immediately obtain
\[
y' = \frac{d(\sin x)}{dx} = \cos x.
\]
The function $y = \cos x$ can be differentiated in exactly the same way.
Starting with
\[
\frac{\cos(x + h) - \cos x}{h}
= \cos x \, \frac{\cos h - 1}{h} - \sin x \, \frac{\sin h}{h}
\]
and taking the limit as $h \to 0$, we obtain the derivative¹
\[
y' = \frac{d(\cos x)}{dx} = -\sin x.
\]
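The two limit relations and the split form of the sine difference quotient can be observed numerically. The sketch below is only an illustration; the function name `sine_quotient` is a hypothetical choice, and the argument is measured in radians.

```python
import math


def sine_quotient(x, h):
    """Difference quotient of sin at x, written in the split form
    sin x * (cos h - 1)/h + cos x * (sin h)/h."""
    return math.sin(x) * (math.cos(h) - 1) / h + math.cos(x) * math.sin(h) / h


x = 1.0
for h in (1e-1, 1e-2, 1e-3):
    # Columns: h, (sin h)/h, (cos h - 1)/h, difference quotient, cos x
    print(h, math.sin(h) / h, (math.cos(h) - 1) / h, sine_quotient(x, h), math.cos(x))
```

As $h$ decreases, the second column approaches 1, the third approaches 0, and the fourth approaches $\cos x$.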


d. Some Fundamental Rules for Differentiation 


Just as in the case of the integral, there exist certain basic rules for 
differentiation that follow immediately from the definition and suffice 
for forming the derivative for many functions. 


1. If $\phi(x) = f(x) + g(x)$, then $\phi'(x) = f'(x) + g'(x)$.
2. If $\psi(x) = cf(x)$ (where $c$ is a constant), then $\psi'(x) = cf'(x)$.


1 If $x$ is interpreted as an angle, then these simple formulas for the
derivatives of $\sin x$ and $\cos x$ presuppose, of course, that the angle $x$
is measured in radians.

We have
\[
\frac{\phi(x + h) - \phi(x)}{h} = \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h}
\]
and
\[
\frac{\psi(x + h) - \psi(x)}{h} = c \, \frac{f(x + h) - f(x)}{h},
\]
and our statements follow directly by passage to the limit.


Thus, for example, the derivative of the function $\phi(x) = f(x) + ax + b$
(where $a$ and $b$ are constants) is given by the equation
\[
\phi'(x) = f'(x) + a.
\]
With the help of these rules and of the formula for the derivative of a
power we can immediately differentiate any polynomial
\[
y = a_0 x^{n} + a_1 x^{n-1} + \cdots + a_n
\]
and find
\[
y' = n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + 2 a_{n-2} x + a_{n-1}.
\]
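The polynomial rule is mechanical enough to state as a short program. The following Python sketch assumes a polynomial is stored as the list of coefficients $[a_0, a_1, \ldots, a_n]$ in the order used above; the function names `differentiate` and `evaluate` are hypothetical.

```python
def differentiate(coeffs):
    """Map [a0, ..., an] (for a0*x**n + ... + an) to the derivative's coefficients."""
    n = len(coeffs) - 1
    if n == 0:
        return [0]  # the derivative of a constant is zero
    # The term a_k * x**(n - k) becomes (n - k) * a_k * x**(n - k - 1).
    return [(n - k) * a for k, a in enumerate(coeffs[:-1])]


def evaluate(coeffs, x):
    """Horner evaluation of the polynomial with the given coefficients."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


p = [2, 0, -5, 7]        # 2x^3 - 5x + 7
dp = differentiate(p)    # coefficients of 6x^2 - 5
print(dp, evaluate(dp, 2))
```

Only the sum rule, the constant-factor rule, and the power formula are used, which is exactly the content of the paragraph above.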


e. Differentiability and Continuity of Functions 


It is useful to know that differentiability is a stronger condition than 
continuity: 


If a function is differentiable it is automatically continuous. 


For if the difference quotient $[f(x + h) - f(x)]/h$ approaches a
definite limit as $h$ tends to zero, the numerator of the fraction, that is,
$f(x + h) - f(x)$, must¹ tend to zero with $h$; this just expresses the
continuity of the function $f(x)$ at the point $x$. Hence, separate
cumbersome continuity proofs are unnecessary for functions that can be
shown to be differentiable (that is, for most functions we shall encounter).


Discontinuities of the Derivative-Corners 


The converse, however, is false; it is not true that every continuous
function has a derivative at every point. The simplest counter-example
is the function $f(x) = |x|$, that is, $f(x) = -x$ for $x \le 0$ and $f(x) = x$
for $x \ge 0$; its graph is shown in Fig. 2.24. At the point $x = 0$ this
function is continuous, but has no derivative. The limit of
$[f(x + h) - f(x)]/h$ is equal to 1 if $h$ tends to zero through positive
values, and is equal to $-1$ if $h$ tends to zero through negative values;
if we do not restrict the sign of $h$, no limit exists. We say that our
function has different forward and backward derivatives at the point
$x = 0$, where by forward derivative and backward derivative we mean
respectively the limiting values of $[f(x + h) - f(x)]/h$ as $h$ approaches
zero through positive values only and negative values only. The
differentiability of a function defined in an interval about the point
considered thus requires not merely that the forward and backward
derivatives exist, but that they are equal. Geometrically the inequality
of the two derivatives means that the curve has a corner.
Differentiability expresses in a precise way what intuitively would be
called smoothness of the graph of the function.

Figure 2.24 f(x) = |x|.

1 Since then
\[
\lim_{h \to 0} [f(x + h) - f(x)]
= \lim_{h \to 0} \left[ \frac{f(x + h) - f(x)}{h} \cdot h \right]
= \left( \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \right) \left( \lim_{h \to 0} h \right)
= f'(x) \cdot 0 = 0.
\]
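The corner of $|x|$ at the origin shows up directly in the one-sided difference quotients. The sketch below is an illustration only; `one_sided_quotients` is a hypothetical helper name and the step sizes are arbitrary choices.

```python
def one_sided_quotients(f, x, steps=(1e-1, 1e-3, 1e-6)):
    """Forward (h > 0) and backward (h < 0) difference quotients of f at x."""
    forward = [(f(x + h) - f(x)) / h for h in steps]
    backward = [(f(x - h) - f(x)) / (-h) for h in steps]
    return forward, backward


forward, backward = one_sided_quotients(abs, 0.0)
print(forward)   # quotients as h tends to zero through positive values
print(backward)  # quotients as h tends to zero through negative values
```

The forward quotients all equal 1 and the backward quotients all equal -1, so no common limit exists at $x = 0$.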


Infinite Discontinuities 


As further examples of points where a continuous function is not
differentiable we consider the points where the derivative becomes
infinite, that is, the points at which there exists neither a forward nor a
backward derivative, the difference quotient $[f(x + h) - f(x)]/h$
increasing beyond all bounds as $h \to 0$. For example, the function
$y = f(x) = \sqrt[3]{x} = x^{1/3}$ is defined and continuous for all values of $x$.
For all nonzero values of $x$ its derivative is given (p. 164) by the formula
$y' = \frac{1}{3} x^{-2/3}$. At the point $x = 0$ we have
$[f(x + h) - f(x)]/h = h^{1/3}/h = h^{-2/3}$, and we see at once that as $h \to 0$
the expression has no limiting value, but, on the contrary, tends to
infinity. This state of affairs is often briefly described by saying that
the function possesses an infinite derivative, or the derivative infinity,
at the point in question; as we should remember, however, this merely
means that as $h$ tends to zero the difference quotient increases beyond
all bounds, and that the derivative in the sense in which we have defined
it really does not exist. The geometrical meaning of an infinite
derivative is that the tangent to the curve is vertical (cf. Fig. 2.25).

Figure 2.25

The function $y = f(x) = \sqrt{x}$, which is defined and continuous for
$x \ge 0$, is also not differentiable at the point $x = 0$. Since $y$ is not
defined for negative values of $x$, we here consider the right-hand
derivative only. The equation $[f(h) - f(0)]/h = 1/\sqrt{h}$ shows that this
derivative is infinite; the curve touches the $y$-axis at the origin
(Fig. 2.26).

Figure 2.26

Finally, in the function $y = \sqrt[3]{x^2} = x^{2/3}$ we have a case in which the
right-hand derivative at the point $x = 0$ is positive and infinite, whereas
the left-hand derivative is negative and infinite, as follows from the
relation
\[
\frac{f(h) - f(0)}{h} = \frac{1}{\sqrt[3]{h}}.
\]
As a matter of fact, the continuous curve $y = x^{2/3}$, the so-called
semicubical parabola or Neil's parabola, has at the origin a cusp with a
tangent perpendicular to the $x$-axis (cf. Fig. 2.27).

Figure 2.27

f. Higher Derivatives and Their Significance 


The graph of the derivative $f'(x)$ of a function is called the derived
curve of the graph of $f(x)$. For example, the derived curve of the
parabola $y = x^2$ is a straight line, represented by the function $y = 2x$.
The derived curve of the sine curve $y = \sin x$ is the cosine curve
$y = \cos x$; similarly, the derived curve of the curve $y = \cos x$ is the
curve $y = -\sin x$. (These latter curves can be obtained from each other
by translation in the direction of the $x$-axis, as is shown in Fig. 2.28.)

It is quite natural to form the derived curves of the derived curves,
that is, to form the derivative of the function $f'(x) = \phi(x)$. This
derivative
\[
\phi'(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h},
\]
provided that it exists, is called the second derivative of the function
$f(x)$; we shall denote it by $f''(x)$.

Similarly, we may attempt to form the derivative of $f''(x)$, the
so-called third derivative of $f(x)$, which we then denote by $f'''(x)$. For
most functions that concern us there is nothing to hinder us from
repeating the process of differentiation as many times as we like, thus
defining an $n$th derivative $f^{(n)}(x)$.¹ Occasionally, it will be convenient
to call the function $f(x)$ its own 0th derivative.

If the independent variable is interpreted as the time $t$ and the motion
of a point is represented as previously by the function $f(t)$, the physical
meaning of the second derivative is the rate of change of the velocity
$f'(t)$ with respect to time, or, as it is usually called, the acceleration.
In the example of the freely falling body the distance traveled in the
time $t$ was given by the function $y = f(t) = at^2$. We found $f'(t) = 2at$
for the velocity at the time $t$. The acceleration has then the constant
value $f''(t) = 2a$ (which is usually identified with the gravitational
constant $g$). Later (p. 236), we shall discuss the geometrical
interpretation of the second derivative in detail. Here, however, we take
note of the following facts: At a point where $f''(x)$ is positive, $f'(x)$
increases as $x$ increases; if here $f'(x)$ is positive, the curve becomes
steeper for increasing $x$. If, on the other hand, $f''(x)$ is negative,
$f'(x)$ decreases as $x$ increases, and if $f'(x)$ is positive, the curve
becomes less steep as $x$ increases.

Figure 2.28 Derived curves of sin x and cos x (panels: f(x) = sin x with
f'(x) = cos x; f(x) = cos x with f'(x) = -sin x).

1 The terms second, third, ..., nth differential coefficient are also
used, or $D^2 f, \ldots, D^n f$ (cf. footnote 3, p. 158).

Finally, we observe that the higher derivatives may be used to
define a function. Thus one can characterize the trigonometric
functions by a so-called differential equation involving the function
and its second derivative. From the formulas $(d \cos x)/dx = -\sin x$,
$(d \sin x)/dx = \cos x$ we obtain immediately by differentiating again,
\[
\frac{d^2}{dx^2} \cos x = -\cos x, \qquad \frac{d^2}{dx^2} \sin x = -\sin x.
\]
Hence if the symbol $u$ stands for either of the functions $\sin x$ or $\cos x$,
we have the relation (differential equation)
\[
u'' = -u.
\]
This differential equation is also clearly satisfied by any linear
combination $u = a \cos x + b \sin x$ with constant coefficients $a$, $b$. We
shall see on p. 312 that such linear combinations, with arbitrary
constants $a$ and $b$, are the only functions $u$ for which $u'' = -u$.

In all types of applications involving oscillations or wave phenomena,
such as motions of springs or waves on the surface of water, we are led
directly from physical considerations to a differential equation of the
type $u'' = -u$ for the physically significant variable $u$ (usually the
independent variable is time). It is therefore important to recognize
that $u$ can be represented simply in terms of trigonometric functions
(see Chapter 9).


g. Derivative and Difference Quotient. Leibnitz’s Notation 


In Leibnitz's notation the passage to the limit in the process of
differentiation is symbolically expressed by replacing the symbol $\Delta$
by the symbol $d$, motivating Leibnitz's symbol for the derivative
defined by the equation
\[
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.
\]


If we wish to obtain a clear grasp of the meaning of the differential
calculus, we must beware of the old fallacy of imagining the derivative
as the quotient of two “quantities” $dy$ and $dx$ which are actually
“infinitely small.” The difference quotient $\Delta y/\Delta x$ has a meaning only
for differences $\Delta x$ which are not equal to zero. After forming this
genuine difference quotient we must perform the passage to the limit by
means of a transformation or some other device which also in the limit
avoids division by zero. It does not make sense to suppose that first
$\Delta x$ and $\Delta y$ go through something like a limiting process and reach
values which are infinitesimally small but still not zero, so that $\Delta x$ and
$\Delta y$ are replaced by “infinitely small quantities” or “infinitesimals”
$dx$ and $dy$, and that the quotient of these quantities is then formed.
Such a conception of the derivative is incompatible with mathematical
clarity; in fact, it is entirely meaningless. For many people it
undoubtedly has a certain charm of mystery, always associated with the
word “infinite”; in the early days of the differential calculus even
Leibnitz himself was capable of combining these vague mystical ideas
with a thoroughly clear handling of the limiting process. But today the
mysticism of infinitely small quantities has no place in the calculus.

The notation of Leibnitz, however, is not merely suggestive in itself,
but it is actually extremely flexible and useful. The reason is that in
many calculations and formal transformations we can deal with the
symbols $dy$ and $dx$ exactly as if they were ordinary numbers. By
treating $dx$ and $dy$ like numbers we can give neater expression to many
calculations which can admittedly be carried out without their use. In
the following chapters we shall see this fact verified over and over
again and shall find ourselves justified in making free and repeated use
of it, provided we do not lose sight of the symbolical character of the
signs $dy$ and $dx$.

*For the second and higher derivatives too, Leibnitz devised a
suggestive notation. He considered the second derivative as the limit of
the “second difference quotient” in the following manner: In addition
to the variable $x$ we consider $x_1 = x + h$ and $x_2 = x + 2h$. We then
take the second difference quotient, meaning the first difference quotient
of the first difference quotient, that is, the expression
\[
\frac{1}{h} \left( \frac{y_2 - y_1}{h} - \frac{y_1 - y}{h} \right)
= \frac{1}{h^2} (y_2 - 2y_1 + y),
\]
where $y = f(x)$, $y_1 = f(x_1)$, and $y_2 = f(x_2)$. Writing $h = \Delta x$,
$y_2 - y_1 = \Delta y_1$, and $y_1 - y = \Delta y$, we may appropriately call the
expression in the last parentheses the difference of the difference of $y$
or the second difference of $y$ and write symbolically¹
\[
y_2 - 2y_1 + y = \Delta y_1 - \Delta y = \Delta(\Delta y) = \Delta^2 y.
\]
In this symbolic notation the second difference quotient is then written
$\Delta^2 y/(\Delta x)^2$, where the denominator is really the square of $\Delta x$,
whereas in the numerator the superscript 2 symbolically denotes the
repetition of the difference process. The second derivative is then
expressed by
\[
f''(x) = \lim_{\Delta x \to 0} \frac{\Delta^2 y}{(\Delta x)^2}.
\]
This symbolism for the difference quotient² led Leibnitz to introduce
the notation
\[
y'' = f''(x) = \frac{d^2 y}{dx^2}, \qquad y''' = f'''(x) = \frac{d^3 y}{dx^3}, \quad \text{etc.},
\]
for the second and higher derivatives, and we shall find that this
notation also stands the test of usefulness.

1 Here $\Delta\Delta = \Delta^2$ is merely a symbol for “difference of difference” or
“second difference.”

2 As we must emphasize, the statement that the second derivative may be
represented as the limit of the second difference quotient requires proof.
We previously defined the second derivative, not in this way, but as the
limit of the first difference quotient of the first derivative. The two
definitions are equivalent, provided the second derivative is continuous;
the proof, however, will be given only later (see Chapter 5, Appendix II),
since we have no particular need of the result.
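The second difference quotient is easy to evaluate numerically. The following Python sketch is an illustration under the stated definition, with $y = f(x)$, $y_1 = f(x + h)$, $y_2 = f(x + 2h)$; the name `second_difference_quotient` is hypothetical. It uses $f(x) = \sin x$, for which the second derivative is $-\sin x$.

```python
import math


def second_difference_quotient(f, x, h):
    """(y2 - 2*y1 + y) / h**2 with y = f(x), y1 = f(x + h), y2 = f(x + 2h)."""
    y, y1, y2 = f(x), f(x + h), f(x + 2 * h)
    return (y2 - 2 * y1 + y) / h ** 2


x = 1.0
for h in (1e-1, 1e-2, 1e-3):
    # Columns: h, second difference quotient of sin at x, second derivative -sin x
    print(h, second_difference_quotient(math.sin, x, h), -math.sin(x))
```

For much smaller $h$ the subtraction in the numerator loses digits to rounding, so the quotient should not be pushed to extremely small steps in floating-point arithmetic.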


h. The Mean Value Theorem of Differential Calculus 


The difference quotient involves the values of a function for distinct
values of $x$, whereas the derivative at a point tells us nothing about the
function at any other point; the difference quotient reflects properties of
the function “in-the-large,” while the derivative reflects a local property
or a property “in-the-small.” We shall often need to derive over-all or
“global” properties of a function from the local properties given by its
derivative. For this purpose we utilize a fundamental relation between
the difference quotient and derivative known as “the mean value
theorem of differential calculus.”

The mean value theorem is easily appreciated intuitively. We
consider the difference quotient
\[
\frac{f(x_1) - f(x_2)}{x_1 - x_2} = \frac{\Delta f}{\Delta x}
\]
of a function $f(x)$, and assume that the derivative exists everywhere in
the closed interval $x_1 \le x \le x_2$, so that the graph of the curve has a
tangent everywhere. The difference quotient is the tangent of the
angle $\alpha$ of inclination of the secant, shown in Fig. 2.29. Imagine this
secant shifted parallel to itself. At least once it will reach a position in
which it is a tangent to the curve at a point between $x_1$ and $x_2$, certainly
at that point of the curve which is at the greatest distance from
the secant, say at $x = \xi$. Hence there exists an intermediate value $\xi$ in
the interval such that
\[
\frac{f(x_1) - f(x_2)}{x_1 - x_2} = f'(\xi).
\]
This statement is called the mean value theorem of the differential
calculus.² We can also express it somewhat differently by noticing that
the number $\xi$ may be written in the form
\[
\xi = x_1 + \theta(x_2 - x_1),
\]
where all we know about $\theta$ is that it lies between 0 and 1. Although $\theta$
(or $\xi$) generally cannot be specified more exactly, the theorem is
extremely powerful in application.

1 This is the customary notation. Writing $y'' = d^2y/(dx)^2$,
$y''' = d^3y/(dx)^3$ with parentheses, would be somewhat clearer, but is not
done ordinarily.

2 A more appropriate name would be the intermediate value theorem of
differential calculus.

Consider, for example, the case where $x$ is the time and $y = f(x)$ the
distance of a car from its starting point along a certain road. Then
$f'(x)$ is the velocity of the car at the time $x$. If, say, during the first two
hours ($\Delta x = 2$) the driver has covered a distance $\Delta f = 120$ miles, we
can conclude from the mean value theorem that at least at one moment
$\xi$ during those two hours the driver had a speed of exactly 60 miles
per hour (provided the velocity exists at every moment). The driver
cannot claim, for instance, to have traveled all the time at less than 50
miles per hour. On the other hand, there is nothing to indicate what
the time $\xi$ was at which the precise speed of 60 miles per hour was
attained; it might have been at some time during the first hour or
during the second hour or on several occasions.

Figure 2.29

A precise statement of the mean value theorem is the following:

If $f(x)$ is continuous in the closed interval $x_1 \le x \le x_2$ and
differentiable at every point of the open interval $x_1 < x < x_2$, then there
exists at least one value $\theta$, where $0 < \theta < 1$, such that
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(x_1 + \theta(x_2 - x_1)).
\]
If we replace $x_1$ by $x$ and $x_2$ by $x + h$, we can express the mean
value theorem by the formula
\[
\frac{f(x + h) - f(x)}{h} = f'(\xi) = f'(x + \theta h), \qquad x < \xi < x + h.
\]
Although it is essential that $f(x)$ should be continuous for all points
of the interval, including the end points, we need not assume that the
derivative exists at the end points.

If at any point in the interior of the interval the derivative fails to
exist, the mean value theorem is not necessarily true. It is easy to see
this from the example of $f(x) = |x|$: on the interval $-1 \le x \le 1$ the
difference quotient is 0, whereas $f'(x)$ takes only the values $\pm 1$
wherever it exists.


i. Proof of the Theorem 


The mean value theorem is usually derived by reduction to a special
case which we establish first.

ROLLE'S THEOREM. If a function $\phi(x)$ is continuous in the closed
interval $x_1 \le x \le x_2$ and differentiable in the open interval
$x_1 < x < x_2$, and if in addition $\phi(x_1) = 0$ and $\phi(x_2) = 0$, then there
exists at least one point $\xi$ in the interior of the interval at which
$\phi'(\xi) = 0$.

Interpreted geometrically, this means that if a curve reaches the
$x$-axis at two points, then it must have a horizontal tangent at some
intermediate point (Fig. 2.30).

Figure 2.30


Indeed, since $\phi(x)$ is continuous in the closed interval $[x_1, x_2]$ there
exists a greatest value $M$ of $\phi(x)$ and a smallest value $m$ in that interval
(see p. 101). Since $\phi$ vanishes in the end points, we must have
$m \le 0 \le M$. If these greatest and least values should be equal, then
necessarily $m = M = 0$ and $\phi(x) = 0$ at all points of the interval; then
also $\phi'(x) = 0$ in the interval, and hence $\phi'(\xi) = 0$ for every $\xi$ in the
interval. Thus we only have to consider the case where $m$ and $M$ are not
both zero. If, in particular, $M$ is not zero, then $M$ must be positive. There
exists a point $\xi$ of the interval $[x_1, x_2]$ where $\phi(\xi) = M$. Since $\phi$
vanishes in the end points of the interval, the point $\xi$ must be an interior
point. Furthermore, $\phi(x) \le \phi(\xi) = M$ for all $x$ in $[x_1, x_2]$.
Consequently, for every number $h$ whose absolute value $|h|$ is small
enough, the inequality $\phi(\xi + h) - \phi(\xi) \le 0$ holds. This implies that the
quotient
\[
\frac{\phi(\xi + h) - \phi(\xi)}{h}
\]
is negative or zero for $h > 0$ and positive or zero for $h < 0$. If we
let $h$ tend to zero through positive values, we find that $\phi'(\xi) \le 0$,
whereas for $h$ tending to zero through negative values it follows that
$\phi'(\xi) \ge 0$. Hence $\phi'(\xi) = 0$ and we have proved Rolle's theorem in the
case $M \neq 0$. The same argument holds for $m \neq 0$.

To prove the mean value theorem we apply Rolle's theorem to a
function which represents the vertical distance between the point
$(x, f(x))$ of the graph and its secant:
\[
\phi(x) = f(x) - f(x_1) - \frac{f(x_2) - f(x_1)}{x_2 - x_1} (x - x_1).
\]
This function¹ obviously satisfies the condition $\phi(x_1) = \phi(x_2) = 0$, and
is of the form $\phi(x) = f(x) + ax + b$ with constant coefficients
$a = -[f(x_2) - f(x_1)]/(x_2 - x_1)$ and $b$. From p. 166 we know that
\[
\phi'(x) = f'(x) + a,
\]
and thus by Rolle's theorem
\[
0 = \phi'(\xi) = f'(\xi) + a
\]
for a suitably chosen intermediate value $\xi$; hence
\[
f'(\xi) = -a = \frac{f(x_2) - f(x_1)}{x_2 - x_1};
\]
thus the mean value theorem is proved.

1 This function also is proportional to the distance of the point
$(x, f(x))$ of the curve from the secant; the reader can easily verify this
for himself, for example, by using the fact from elementary analytical
geometry that the expression $(y - mx - b)/\sqrt{1 + m^2}$ represents the
(signed) distance of the point $(x, y)$ from the line with the equation
$y - mx - b = 0$. In this way we find that indeed at the points of the curve
having greatest distance from the secant the tangent is parallel to the
secant.
Significance of the Theorem 


The derivative of a function had been defined as the limit of difference
quotients for an interval as the end points approach each other. The
mean value theorem establishes a connection between difference
quotients and derivatives of a differentiable function which does not
involve the shrinking of the interval. Each difference quotient is equal
to the derivative at a suitable intermediate point $\xi$.


Examples. Just as in the mean value theorem of integral calculus
there is nothing specific asserted in the mean value theorem of
differential calculus about the location of $\xi$ beyond the fact that $\xi$ lies
in the interior of the interval. For the example of the quadratic function
$y = f(x) = x^2$ with derivative $f'(x) = 2x$ we find
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} = x_2 + x_1 = f'(\xi),
\]
where $\xi = (x_1 + x_2)/2$ is the midpoint of the interval $[x_1, x_2]$. In
general, however, $\xi$ might lie anywhere else between $x_1$ and $x_2$. For
example, if $f(x) = x^3$, we have $[f(1) - f(0)]/(1 - 0) = 1 = f'(\xi) = 3\xi^2$,
where $\xi = 1/\sqrt{3}$.
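The intermediate point in the second example can also be located numerically. The sketch below solves $f'(\xi) = $ slope by bisection; this is merely one convenient method under the assumption that $f'$ is continuous and $f' - \text{slope}$ changes sign on the interval, and it is not how the theorem itself is proved. The name `mean_value_point` is hypothetical.

```python
def mean_value_point(df, slope, lo, hi, tol=1e-12):
    """Bisection for a root of df(t) - slope on [lo, hi].

    Assumes df is continuous and df - slope changes sign on the interval.
    """
    g = lambda t: df(t) - slope
    if g(lo) * g(hi) > 0:
        raise ValueError("df - slope does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid  # a root lies in the left half
        else:
            lo = mid  # a root lies in the right half
    return 0.5 * (lo + hi)


f = lambda t: t ** 3
df = lambda t: 3 * t ** 2
slope = (f(1.0) - f(0.0)) / (1.0 - 0.0)   # difference quotient on [0, 1]
xi = mean_value_point(df, slope, 0.0, 1.0)
print(xi, 3 ** -0.5)                      # compare with 1/sqrt(3)
```

For $f(x) = x^2$ the same routine returns the midpoint of the interval, in agreement with the first example.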


Monotonic Functions. As one of many applications of the mean
value theorem of differential calculus we prove that if the derivative of
$f(x)$ has a constant sign, then $f$ is monotonic. Specifically, we assume
$f(x)$ to be continuous in the closed interval $[a, b]$ and differentiable
at each point of the open interval $(a, b)$. If then $f'(x) > 0$ for $x$ in
$(a, b)$, then the function $f(x)$ is monotonic increasing; similarly, if
$f'(x) < 0$, the function is monotonic decreasing. The proof is obvious:
Let $x_1$ and $x_2$ be any two values in the closed interval $[a, b]$. Then there
exists a $\xi$ between $x_1$ and $x_2$, and hence also between $a$ and $b$, such that
\[
f(x_2) - f(x_1) = f'(\xi)(x_2 - x_1).
\]
If $f'(x) > 0$ everywhere in $(a, b)$ we have in particular $f'(\xi) > 0$.
Hence $f(x_2) - f(x_1)$ is positive for $x_2 > x_1$; that is, $f(x)$ is increasing.
Similarly, $f$ is decreasing if $f'(x) < 0$ in $(a, b)$.

In the same way we show that a function $f(x)$ continuous in $[a, b]$
and differentiable in the open interval $(a, b)$ must be a constant if
$f'(x) = 0$ everywhere in $(a, b)$. For then
\[
f(x_2) - f(x_1) = f'(\xi)(x_2 - x_1) = 0.
\]
This important statement corresponds to the intuitively obvious fact
that a curve whose tangent at every point is parallel to the $x$-axis must
be a straight line which is parallel to the $x$-axis.


Lipschitz-Continuity of Differentiable Functions. It was mentioned
earlier that a function $f(x)$ having a derivative is necessarily continuous.
The mean value theorem of differential calculus furnishes much more
precise quantitative information, namely, a modulus of continuity. We
consider a function $f(x)$ which is defined in the closed interval $[a, b]$
and has a derivative $f'(x)$ at each point of that interval. Assume that
$f'(x)$ is bounded in the interval (this is certainly the case provided $f'(x)$
is defined and continuous in the closed interval $[a, b]$); there exists
then a number $M$ such that $|f'(x)| \le M$. For any two values $x_1$, $x_2$
in $(a, b)$ we infer from the mean value theorem
\[
|f(x_2) - f(x_1)| = |f'(\xi)(x_2 - x_1)| \le M |x_2 - x_1|.
\]
For given $\epsilon > 0$ we have thus produced a simple modulus of
continuity $\delta = \epsilon/M$ such that
\[
|f(x_2) - f(x_1)| < \epsilon \quad \text{for} \quad |x_2 - x_1| < \delta.
\]
Take, for example, the function $f(x) = x^2$ in the interval $-a \le x \le +a$.
Since
\[
|f'(x)| = |2x| \le 2a
\]
we see that here
\[
|f(x_2) - f(x_1)| < \epsilon \quad \text{for} \quad |x_2 - x_1| < \epsilon/(2a).
\]


We say that a function $f(x)$ “satisfies a Lipschitz-condition” or
is “Lipschitz-continuous” if there is a constant $M$ such that
\[
|f(x_2) - f(x_1)| \le M |x_2 - x_1|
\]
for all $x_1$, $x_2$ in question. This means that all difference quotients
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1}
\]
have the same upper bound $M$ for their absolute value. We see that
every function $f$ with continuous derivative $f'$ on a closed interval is
Lipschitz-continuous. However, even functions that do not have a
derivative at every point can be Lipschitz-continuous, as the example
$f(x) = |x|$ shows. The reader can verify for himself that for this
function always $|f(x_2) - f(x_1)| \le |x_2 - x_1|$.
On the other hand, not every continuous function is
Lipschitz-continuous. This is shown by the example of $f(x) = x^{1/3}$; here
\[
\frac{f(x) - f(0)}{x - 0} = x^{-2/3}
\]
is not bounded for small $x$; hence $f(x)$ is not Lipschitz-continuous at
$x = 0$. This is consistent with the fact that the derivative
$f'(x) = \frac{1}{3} x^{-2/3}$ does not remain bounded as $x$ tends to zero. The
functions which are Lipschitz-continuous form an important class
intermediate between those that are merely continuous and those that
have a continuous derivative.
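The contrast between the two examples can be seen by sampling difference quotients. The Python sketch below is illustrative only: the grid, the choice $a = 1$, and the helper names `max_difference_quotient` and `cube_root` are assumptions made for this example.

```python
import math


def max_difference_quotient(f, points):
    """Largest |f(p) - f(q)| / |p - q| over all pairs of distinct sample points."""
    return max(
        abs(f(p) - f(q)) / abs(p - q)
        for i, p in enumerate(points)
        for q in points[:i]
    )


def cube_root(x):
    """Real cube root, defined for negative x as well."""
    return math.copysign(abs(x) ** (1 / 3), x)


a = 1.0
grid = [-a + 2 * a * k / 40 for k in range(41)]          # 41 points in [-a, a]
print(max_difference_quotient(lambda t: t * t, grid))    # bounded by 2a

for x in (1e-2, 1e-4, 1e-6):
    # Difference quotient of the cube root between 0 and x; behaves like x**(-2/3).
    print(x, (cube_root(x) - cube_root(0.0)) / x)
```

For $x^2$ the sampled quotients never exceed the Lipschitz constant $2a$, whereas for the cube root they grow without bound as $x$ approaches zero.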


j. The Approximation of Functions by Linear Functions. 
Definition of Differentials 


Definition. The derivative of a function $y = f(x)$ was defined by
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x},
\]
where $\Delta x = h$. If for a fixed $x$ and a variable $h$, we define a quantity
$\epsilon$ by
\[
\epsilon(h) = \frac{f(x + h) - f(x)}{h} - f'(x) = \frac{\Delta y}{\Delta x} - f'(x),
\]
then the fact that $f'(x)$ is the derivative of $f$ at the point $x$ amounts to
the equation
\[
\lim_{h \to 0} \epsilon(h) = 0.
\]
The quantity $\Delta y = f(x + h) - f(x)$ represents the change or increment
in the value of the dependent variable $y$ that results when the value $x$
of the independent variable is changed by the amount $\Delta x = h$. Since
\[
\Delta y = f'(x) \, \Delta x + \epsilon \, \Delta x,
\]
the quantity $\Delta y$ appears as the sum of two parts, namely, a part
$f'(x) \, \Delta x$ which is proportional to $\Delta x$ and a part $\epsilon \, \Delta x$ which can be
made as small as we please compared to $\Delta x$ by making $\Delta x$ itself small
enough. The dominant, linear part in the expression for $\Delta y$ we shall call
the differential $dy$ of $y$ and write for it
\[
dy = df(x) = f'(x) \, \Delta x.
\]
For any differentiable function $f$ and for a fixed $x$ this differential is
a well-defined linear function of $h = \Delta x$. For example, for the function
$y = x^2$ we have $dy = d(x^2) = 2x \, \Delta x = 2xh$. For the particular function
$y = x$ whose derivative has the constant value one, we simply have
$dx = \Delta x$. It is then consistent with our definition to write $dx$ for $\Delta x$
when $x$ is the independent variable; hence the differential of any
function $y = f(x)$ can also be written as
\[
dy = df(x) = f'(x) \, dx.
\]
The increment of the dependent variable
\[
\Delta y = f'(x) \, dx + \epsilon \, dx = dy + \epsilon \, dx
\]
differs from the differential $dy$ by the amount $\epsilon \, dx$, which in general is
not zero. In the example of the function $y = x^2$ we have $dy = 2x \, dx$,
whereas
\[
\Delta y = (x + dx)^2 - x^2 = 2x \, dx + (dx)^2 = dy + \epsilon \, dx,
\]
where $\epsilon = dx$.

Earlier we used the symbol $dy/dx$ purely symbolically to denote the
limit of the quotient $\Delta y/\Delta x$ for $\Delta x$ tending to zero. With our present
definition of the differentials $dy$ and $dx$ the derivative $dy/dx$ can actually
be considered as the ordinary quotient of $dy$ and $dx$. Here, however,
$dy$ and $dx$ are now not in any sense “infinitely small” quantities or
“infinitesimals”; such an interpretation would be devoid of meaning.
Instead $dy$ and $dx$ are well-defined linear functions of $h = \Delta x$ which for
large $\Delta x$ may have large numerical values. There is nothing remarkable
in the fact that the quotient $dy/dx$ of those quantities has the same value
as the derivative $f'(x)$. This is merely a tautology restating the definition
of $dy$ as $f'(x) \, dx$.

Rewriting the relation between increment and differential of $f$ in
the form
\[
f(x + h) = f(x) + h f'(x) + \epsilon h,
\]
we see that the expression $f(x + h)$ considered as a function of $h$ is
represented by the linear function $f(x) + h f'(x)$ with an error $\epsilon h$ which
is arbitrarily small compared to $h$ if $h$ is sufficiently small. This
approximate representation of $f(x + h)$ by the linear function
$f(x) + h f'(x)$ means geometrically that we replace the curve by its
tangent at the point $x$ (see Fig. 2.31).

Figure 2.31 Increment Δy and differential dy.

1 Similarly, higher-order differentials could be defined by
$d^2 y = f''(x) h^2 = f''(x)(dx)^2$, $d^3 y = f'''(x)(dx)^3$, etc., in agreement
with Leibnitz's notation for the higher derivatives.


Linear Approximation 


A more precise estimate of the magnitude of the “error,” that is,
of the deviation of the function $f(x)$ from the linear function representing
the tangent, is given by the mean value theorem of differential calculus.
We have for a suitable $\xi$ between $x$ and $x + h$
\[
f(x + h) - f(x) = h f'(\xi),
\]
so that
\[
\epsilon = \frac{f(x + h) - f(x)}{h} - f'(x) = f'(\xi) - f'(x).
\]
If, as usually in applications, the function $f'(x)$ itself has a derivative
$f''(x)$, we find by applying the mean value theorem a second time that
\[
f'(\xi) - f'(x) = (\xi - x) f''(\eta),
\]
where $\eta$ is a value intermediate between $x$ and $\xi$ and hence also between
$x$ and $x + h$. It follows that
\[
|\epsilon| = |(\xi - x) f''(\eta)| = |\xi - x| \, |f''(\eta)| \le |h| M,
\]
where $M$ is any upper bound for the absolute value of the second
derivative of $f$ in the interval $[x, x + h]$. Then $|\epsilon h|$, which measures the
deviation of $f(x + h)$ from the linear function $f(x) + h f'(x)$, is at
most $M h^2$. For sufficiently small $h$ the expression $M h^2$ is, of course,
much smaller than $|f'(x) h|$, unless $f'(x)$ happens to have the value zero.
This approximation of a function in a small interval by a linear function
is of greatest significance both for practical applications and for
advanced mathematical analysis. We shall return to this topic in later
chapters, and incidentally derive then the better estimate
$|\epsilon h| \le \frac{1}{2} M h^2$.
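The bound $|\epsilon h| \le M h^2$ can be checked on a concrete function. The sketch below takes $f(x) = \sin x$, for which $M = 1$ bounds $|f''|$ everywhere; the point $x = 0.75$, the step sizes, and the name `linear_error` are arbitrary choices made for illustration.

```python
import math


def linear_error(f, df, x, h):
    """|f(x + h) - (f(x) + h * df(x))|, the quantity |epsilon * h| of the text."""
    return abs(f(x + h) - (f(x) + h * df(x)))


M = 1.0   # upper bound for |f''| = |sin|, valid for this particular example
x = 0.75
for h in (1e-1, 1e-2, 1e-3):
    # Columns: h, actual deviation from the tangent, bound M*h^2
    print(h, linear_error(math.sin, math.cos, x, h), M * h ** 2)
```

The deviation falls roughly by a factor of 100 each time $h$ is divided by 10, reflecting its dependence on the square of $h$.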


Interpolation 


When a function $f(x)$ is described numerically by a table of values,
$f$ is ordinarily determined by linear interpolation for arguments $x$
intermediate between those for which $f$ is listed. This procedure also
corresponds to replacing the function $f$ by a linear function in an
interval. In this case the graph of the linear function is given by a
secant rather than by a tangent to the curve representing $f$. If, say, the
values of $f$ are known at two points $a$ and $b$, we replace $f(x)$ for
intermediate $x$ by the expression
\[
\phi(x) = f(a) + (x - a) \frac{f(b) - f(a)}{b - a},
\]
which is linear in $x$ and gives the correct values of $f$ at the end points
$x = a$ and $x = b$ of the interval (see Fig. 2.32). Here again by use of
the mean value theorem we can estimate the error in this approximation.
We have
\[
f(x) - \phi(x) = (x - a) \left[ \frac{f(x) - f(a)}{x - a} - \frac{f(b) - f(a)}{b - a} \right]
= (x - a) [f'(\xi_1) - f'(\xi_2)].
\]
Since $\xi_1$ lies between $a$ and $x$, it also lies between $a$ and $b$, as does
$\xi_2$. A second application of the mean value theorem of differential
calculus then yields
\[
f(x) - \phi(x) = (x - a) f''(\eta)(\xi_1 - \xi_2),
\]
where $\eta$ is between $\xi_1$ and $\xi_2$ and hence also between $a$ and $b$.
Consequently, denoting by $M$ an upper bound for $|f''|$ in the interval
$[a, b]$, we find that
\[
|f(x) - \phi(x)| \le |x - a| \, |\xi_1 - \xi_2| \, |f''(\eta)| \le M (b - a)^2.
\]
Once again the deviation of $f$ from its linear approximation can be
estimated by the square of the length of the interval.

Figure 2.32 Linear interpolation.

As a numerical example we take from a table of trigonometric 
functions in radian measure the values 


sin 0.75 = 0.6816, sin 0.76 = 0.6889, 


where the errors do not exceed 0.00005. If we want to deduce the value 
of the sine function for the intermediate argument 0.754, we find by 
linear interpolation that 


\[ \sin 0.754 \approx 0.6816 + \tfrac{4}{10}\,(0.6889 - 0.6816) \approx 0.6845. \]


For the function $f(x) = \sin x$ the first derivative is $f'(x) = \cos x$, the
second derivative $f''(x) = -\sin x$. Obviously, $|f''(x)| \le 1$, so that the
error in the value found for sin 0.754 as a result of the linear
interpolation procedure does not exceed $1 \times (0.01)^2 = 0.0001$. To this
error estimate we must add possible errors due to round-off in the 
tabulated values and in the interpolation. 

We can compare this value obtained by linear interpolation with the 
value we would obtain by replacing the sine curve by its tangent at 
the point x = 0.75. Taking f'(0.75) = cos 0.75 = 0.7317 from the 
table, we find 


\[ \sin 0.754 \approx f(0.75) + f'(0.75)(0.004) = \sin 0.75 + 0.004 \cos 0.75 \approx 0.6845. \]


Incidentally, the true value of sin 0.754 correct to six significant digits 
is 0.684560. 
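The two computations can be reproduced in a few lines. This is a small sketch using the tabulated four-place values quoted in the example; the function names `interpolate` and `tangent` are illustrative, not taken from the text.

```python
import math

def interpolate(a, fa, b, fb, x):
    """Value at x of the secant line through (a, fa) and (b, fb)."""
    return fa + (x - a) * (fb - fa) / (b - a)

def tangent(x0, f0, df0, x):
    """Value at x of the tangent line at x0 with value f0 and slope df0."""
    return f0 + df0 * (x - x0)

# Tabulated four-place values from the example in the text.
secant_value = interpolate(0.75, 0.6816, 0.76, 0.6889, 0.754)
tangent_value = tangent(0.75, 0.6816, 0.7317, 0.754)
true_value = math.sin(0.754)

# Error bound M * (b - a)^2 with M = 1, since |sin''| <= 1.
bound = 1.0 * (0.76 - 0.75) ** 2

print(f"secant:  {secant_value:.4f}")
print(f"tangent: {tangent_value:.4f}")
print(f"true:    {true_value:.6f}")
print(f"bound:   {bound:.4f}")
```

Both approximations agree with the true value to four places, well inside the bound 0.0001 plus the round-off of the table.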


k. Remarks on Applications to the Natural Sciences 


In applying mathematics to natural phenomena we never deal with 
precisely known quantities. Whether a length is exactly a meter is a 
question which cannot be decided by any experiment and which con- 
sequently has no physical meaning. Moreover, there is no immediate 
physical meaning in saying that the length of a material rod is rational 
or irrational; we can always measure it with any desired degree of 
accuracy by rational numbers, and the only meaningful question is
whether we can manage to perform such a measurement using rational 
numbers with relatively small denominators. Just as the question of 
rationality or irrationality in the rigorous sense of “exact mathematics” 
has no physical meaning, carrying out limiting processes in applications 
is usually not more than a mathematical idealization. 

The practical—and overwhelming—significance of such idealizations 
lies in the fact that through the idealizations analytical expressions 
become essentially simpler and more manageable. For example, it is 
vastly simpler and more convenient to work with the notion of in- 
stantaneous velocity, which is a function of only one definite instant of 
time, than with the notion of average velocity between two different 
instants. Without such idealization every scientific investigation of 
nature would be condemned to hopeless complications and would 
bog down at the outset. 

We do not intend to enter into a philosophical discussion of the 
relationship of mathematics to reality. For the sake of better under- 
standing of the theory, it should be emphasized that in applications we 
have the right to replace a derivative by a difference quotient and vice 
versa, provided only that the differences are small enough to guarantee 
a sufficiently close approximation. The physicist, the biologist, the 
engineer, or anyone else who has to deal with these ideas in practice will 
therefore have the right to identify the difference quotient with the 
derivative within his limits of accuracy. The smaller the increment
$h = dx$ of the independent variable, the more accurately can he
represent the increment $\Delta y = f(x + h) - f(x)$ by the differential
$dy = hf'(x)$. As long as he keeps knowingly within the limits of accuracy
required by the problem, he might even be permitted to speak of the
quantities $dx = h$ and $dy = hf'(x)$ as "infinitesimals." These "physically
infinitesimal" quantities have a precise meaning. They are
variables with values which are finite, unequal to zero, and chosen 
small enough for the given investigation, for example, smaller than a 
fractional part of a wavelength or smaller than the distance between 
two electrons in an atom; in general, smaller than the degree of 
accuracy required. 


2.9 The Integral, the Primitive Function, and 
the Fundamental Theorems of the Calculus 
a. The Derivative of the Integral 


As already stated, the connection between integration and differen- 
tiation is the cornerstone of the differential and integral calculus. 
We recall from Section 2.4 that an indefinite integral of a continuous
function $f(x)$ is defined as a function $\phi(x)$ of the upper end
point of integration by the formula

\[ \phi(x) = \int_\alpha^x f(u)\,du, \]

where $\alpha$ was any point in the domain of $f$. We shall now prove


FUNDAMENTAL THEOREM OF CALCULUS (Part One). The indefinite
integral $\phi(x)$ of a continuous function $f(x)$ always possesses a derivative
$\phi'(x)$, and moreover

\[ \phi'(x) = f(x). \]

That is, differentiation of the indefinite integral of a continuous
function always reproduces the integrand:

\[ \frac{d}{dx} \int_\alpha^x f(u)\,du = f(x). \]


This inverse character of the operations of differentiation and integration
is the basic fact of calculus. The proof is an immediate consequence of
the mean value theorem of integral calculus. According to that
theorem we have for any values $x$ and $x + h$ of the domain of $f$

\[ \phi(x + h) - \phi(x) = \int_x^{x+h} f(u)\,du = h f(\xi), \]

where $\xi$ is some value in the interval with end points $x$ and $x + h$.
For $h$ tending to zero the value $\xi$ must tend to $x$ so that

\[ \lim_{h \to 0} \frac{\phi(x + h) - \phi(x)}{h} = \lim_{h \to 0} f(\xi) = f(x), \]

since $f$ is continuous. Hence $\phi'(x) = f(x)$ as stated by the theorem.
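The statement of the theorem can be observed numerically. The sketch below is only an illustration under stated assumptions: the indefinite integral is approximated by a midpoint rule (a choice made here, not the construction of the text), the integrand is $f(u) = 1/u$ with lower limit $\alpha = 1$, and the helper names `phi` and `difference_quotient` are hypothetical.

```python
import math

def phi(f, alpha, x, n=2000):
    """Midpoint-rule approximation of the integral of f from alpha to x."""
    width = (x - alpha) / n
    return width * sum(f(alpha + (i + 0.5) * width) for i in range(n))

def difference_quotient(f, alpha, x, h):
    """[phi(x + h) - phi(x)] / h for the approximate indefinite integral phi."""
    return (phi(f, alpha, x + h) - phi(f, alpha, x)) / h

f = lambda u: 1.0 / u  # assumed example integrand

# As h shrinks, the difference quotient of phi at x = 2 approaches f(2) = 0.5.
for h in (0.1, 0.01, 0.001):
    print(h, difference_quotient(f, 1.0, 2.0, h), f(2.0))
```

The difference quotient equals the mean value of $f$ over $[x, x + h]$, which is why it is squeezed toward $f(x)$ as $h$ tends to zero.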


Applications. (a) We can use the theorem to find derivatives for 
some of the functions introduced earlier. The natural logarithm was 
defined for x > 0 by the indefinite integral 


\[ \log x = \int_1^x \frac{1}{u}\,du. \]

It follows immediately that

\[ \frac{d \log x}{dx} = \frac{1}{x}. \]
(b) More general logarithms to an arbitrary base $a$ were expressible
in the form

\[ \log_a x = \frac{\log x}{\log a}. \]

Applying the rule for the derivative of the product of a constant and of
a function we find that

\[ \frac{d}{dx} \log_a x = \frac{1}{x \log a}. \]

(c) We found that

\[ \frac{d}{dx}\, x^\alpha = \alpha x^{\alpha - 1} \]

in the case where the exponent $\alpha$ is an integer or more generally a
rational number. We can now extend this formula to arbitrary $\alpha$.
For that purpose we recall the integration formula

\[ \int_a^b u^\beta\,du = \frac{1}{\beta + 1}\left(b^{\beta + 1} - a^{\beta + 1}\right), \]

which we had proved for any positive numbers $a$, $b$, and any $\beta \ne -1$.
If we replace here the upper limit $b$ by the variable $x$ and differentiate
both sides with respect to $x$, it follows that for $x > 0$

\[ x^\beta = \frac{d}{dx}\left[\frac{1}{\beta + 1}\left(x^{\beta + 1} - a^{\beta + 1}\right)\right]. \]

Using the rules for the derivative of a sum and of a constant times a
function, we can write this result in the form

\[ \frac{d}{dx}\, x^{\beta + 1} = (\beta + 1)\, x^\beta. \]

Substituting $\alpha$ for $\beta + 1$, we obtain the formula

\[ \frac{d}{dx}\, x^\alpha = \alpha x^{\alpha - 1} \]

for any $\beta \ne -1$, that is, for $\alpha \ne 0$. However, the formula also holds
trivially for $\alpha = 0$ since then $x^\alpha = 1$ and the derivative of a constant
is zero.
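As a quick plausibility check of the extended power rule, the sketch below compares a central difference quotient (an assumption made here for numerical accuracy; the text works with the limit itself) against $\alpha x^{\alpha - 1}$ for a few irrational, negative, and zero exponents. The helper name `power_rule_gap` is hypothetical.

```python
import math

def power_rule_gap(alpha, x, h=1e-6):
    """|central difference quotient of x**alpha  -  alpha * x**(alpha - 1)|."""
    numeric = ((x + h) ** alpha - (x - h) ** alpha) / (2 * h)
    exact = alpha * x ** (alpha - 1)
    return abs(numeric - exact)

# Exponents chosen for illustration: irrational, negative rational, and zero.
for alpha in (math.sqrt(2), math.pi, -0.5, 0.0):
    print(alpha, power_rule_gap(alpha, 1.7))
```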


b. The Primitive Function and Its Relation to the Integral 


Inverting Differentiation 


The fundamental theorem shows that the indefinite integral $\phi(x)$,
that is, the integral with a variable upper limit $x$, of a function $f(x)$,
is a solution of the following problem: Given $f(x)$, determine a function
$F(x)$ such that

\[ F'(x) = f(x). \]


This problem requires us to reverse the process of differentiation. It is 
typical of the inverse problems that occur in many parts of mathe- 
matics and that we have already found to be a fruitful mathematical 
method for generating new concepts. (For example, the first extension 
of the idea of natural numbers is suggested by the desire to invert 
certain elementary processes of arithmetic. Again new kinds of func- 
tions were obtained from the inverses of known functions.) 

Any function F(x) such that F'(x) = f(x) is called a primitive function
of f(x) or simply a primitive of f(x); this terminology suggests that the 
function f(x) is derived from F(x). 

This problem of the inversion of differentiation or of the finding of a 
primitive function at first sight is of quite different character from the 
problem of integration. The first part of the fundamental theorem 
asserts, however: 

Every indefinite integral $\phi(x)$ of the function $f(x)$ is a primitive of $f(x)$.

Yet this result does not completely solve the problem of finding the
primitive functions. For we do not yet know if we have found all the
solutions of the problem. The question about the set of all primitive
functions is answered by the following theorem, sometimes referred to
as the second part of the fundamental theorem of the differential and
integral calculus:

The difference of two primitive functions $F_1(x)$ and $F_2(x)$ of the same
function $f(x)$ is always a constant:

\[ F_1(x) - F_2(x) = c. \]

Thus from any one primitive function $F(x)$ we can obtain all the others
in the form

\[ F(x) + c \]

by suitable choice of the constant $c$. Conversely, for every value of the
constant $c$ the expression $F_1(x) = F(x) + c$ represents a primitive
function of $f(x)$.

It is clear that for any value of the constant $c$ the function $F(x) + c$
is a primitive function, provided that $F(x)$ itself is. For we have
(cf. p. 166)

\[ \frac{d}{dx}\,[F(x) + c] = \frac{d}{dx} F(x) + \frac{d}{dx}\, c = f(x). \]
Thus to complete the proof of our theorem it remains only to show that
the difference of two primitive functions $F_1(x)$ and $F_2(x)$ is always
constant. For this purpose we consider the difference

\[ F_1(x) - F_2(x) = G(x). \]

Clearly,

\[ G'(x) = F_1'(x) - F_2'(x) = f(x) - f(x) = 0. \]

However, we had proved on p. 178 from the mean value theorem of
differential calculus that a function whose derivative vanishes
everywhere in an interval is a constant. Hence $G(x)$ is a constant $c$, and the
theorem follows.
Combining the two parts just proved we can now formulate the 


FUNDAMENTAL THEOREM OF CALCULUS. Every primitive function
$F(x)$ of a given function $f(x)$ continuous on an interval can be
represented in the form

\[ F(x) = c + \phi(x) = c + \int_a^x f(u)\,du, \]

where $c$ and $a$ are constants, and conversely, for any constant values of
$a$ and $c$ chosen arbitrarily,¹ this expression always represents a primitive
function.


Notations 


It may be surmised that the constant c can as a rule be omitted 
because by changing the lower limit a we change the primitive function 
by an additive constant; that is, that all primitive functions are 
indefinite integrals. Frequently, however, we cannot obtain all the 
primitive functions if we omit the c, as the example f(x) = 0 shows. 
For this function the indefinite integral will always be zero, independ- 
ently of the lower limit; yet any arbitrary constant is a primitive 
function of f(x) = 0. A second example is the function $f(x) = \sqrt{x}$,
which is defined for nonnegative values of x only. The indefinite
integral is

\[ \phi(x) = \tfrac{2}{3} x^{3/2} - \tfrac{2}{3} a^{3/2}, \]

and we see that no matter how we choose the lower limit $a$ the
indefinite integral $\phi(x)$ is always obtained from $\tfrac{2}{3} x^{3/2}$ by addition of a
constant $-\tfrac{2}{3} a^{3/2}$ which is less than or equal to zero; yet such a function
as $\tfrac{2}{3} x^{3/2} + 1$ is also a primitive function for $\sqrt{x}$. Thus in the general
expression for the primitive function we cannot dispense with the 
arbitrary additive constant. 


1 As long as a lies in the domain of f. 
The relationship which we have found suggests extending the
notion of the indefinite integral so as to include all primitive functions.
We shall henceforth call every expression of the form
$c + \phi(x) = c + \int_a^x f(u)\,du$ an indefinite integral of $f(x)$, and we shall no longer
distinguish between the primitive function and the indefinite integral.
Nevertheless, if the reader is to have a proper understanding of the 
interrelations of these concepts, it is absolutely necessary to bear in 
mind that in the first instance integration and inversion of differen- 
tiation are two different things, and that it is only the knowledge of the 
relationship between them that gives us the right to apply the term 
“indefinite integral” to the primitive function also. 

It is quite customary to use a notation which is not perfectly clear 
without comment: we write 


\[ F(x) = \int f(x)\,dx, \]

when we mean that the function $F(x)$ is of the form

\[ F(x) = c + \int_a^x f(u)\,du \]

for suitable constants $c$ and $a$, that is, we omit the upper limit $x$, the
lower limit $a$ and the additive constant $c$ and use the letter $x$ for the
variable of integration. Strictly speaking, of course, there is a slight
inconsistency in using the same letter for the variable of integration and
the upper limit $x$ which is the independent variable in $F(x)$. In using
the notation $\int f(x)\,dx$ we must never lose sight of the indeterminacy
connected with it, that is, the fact that the symbol always denotes one
of the primitive functions of $f$ only. The formula $F(x) = \int f(x)\,dx$ is
just a symbolic way of writing the relation

\[ \frac{d}{dx} F(x) = f(x). \]


c. The Use of the Primitive Function for 
Evaluation of Definite Integrals 


Suppose that we know any one primitive function $F(x)$ for the
function $f(x)$ and that we wish to evaluate the definite integral $\int_a^b f(u)\,du$.
We know that the indefinite integral

\[ \phi(x) = \int_a^x f(u)\,du, \]

being also a primitive of $f(x)$, can only differ from $F(x)$ by an additive
constant. Therefore

\[ \phi(x) = F(x) + c, \]

and the additive constant $c$ is determined at once because the indefinite
integral $\phi(x) = \int_a^x f(u)\,du$ must take the value zero when $x = a$. We
thus obtain $0 = \phi(a) = F(a) + c$, from which $c = -F(a)$ and
$\phi(x) = F(x) - F(a)$. In particular, for the value $x = b$ we have the basic
formula

\[ \int_a^b f(u)\,du = F(b) - F(a), \qquad \text{if } F'(x) = f(x). \]


Therefore, 


If F(x) is any primitive function of the continuous function f(x) what- 
soever, the definite integral of f(x) between the limits a and b is equal to 
the difference F(b) — F(a). 


If we use the relation $F'(x) = f(x)$, this consequence of the
fundamental theorem may be written in the form

\[ F(b) - F(a) = \int_a^b F'(x)\,dx = \int_a^b \frac{dF(x)}{dx}\,dx = \int_a^b dF(x), \tag{31} \]

where now $F(x)$ can be any function with a continuous derivative
$F'(x)$, and where we use the suggestive symbolic notation
$dF(x) = F'(x)\,dx$ of Leibniz.

In applying our rule we often use a vertical bar to denote the
difference of values at the end points, writing

\[ \int_a^b \frac{dF(x)}{dx}\,dx = F(b) - F(a) = F(x)\,\Big|_a^b. \]

We can write (31) in the form

\[ \frac{F(b) - F(a)}{b - a} = \frac{1}{b - a} \int_a^b F'(x)\,dx. \tag{32} \]

Recalling the definition of the average of a function in an interval
from p. 141, the rule states then that the difference quotient of the
function $F(x)$ formed for the points $a$ and $b$ is equal to the arithmetic
mean or average of the derivative of $F(x)$ in the interval with end points
$a$ and $b$. When we considered the motion of a particle on a straight
line, we called the change in distance $s$ divided by the change in time $t$
the "average velocity." We see now that indeed $\Delta s/\Delta t$ is precisely the
average of the velocities $ds/dt$ for the given time interval if $t$ is the
independent variable used in forming the average.
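The identification of the difference quotient with the average of the derivative can be illustrated numerically. In this sketch the motion $s(t) = t^3$ on the interval $1 \le t \le 3$ is a hypothetical example chosen here, and the average of the velocity is approximated by sampling it at the midpoints of many equal subintervals of time.

```python
def s(t):
    """Hypothetical position of a particle at time t."""
    return t ** 3

def v(t):
    """Instantaneous velocity ds/dt of the hypothetical motion."""
    return 3 * t ** 2

t0, t1, n = 1.0, 3.0, 10000

# Average velocity as the difference quotient of s.
difference_quotient = (s(t1) - s(t0)) / (t1 - t0)

# Arithmetic mean of the instantaneous velocity, sampled at midpoints
# of n equal subintervals of [t0, t1].
width = (t1 - t0) / n
mean_velocity = sum(v(t0 + (i + 0.5) * width) for i in range(n)) / n

print(difference_quotient, mean_velocity)
```

The averaging is done with respect to time; sampling uniformly in another variable, such as distance, would in general give a different mean.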


RELATION BETWEEN THE MEAN VALUE THEOREMS 


The formula

\[ F(b) - F(a) = \int_a^b f(x)\,dx, \tag{33} \]

which holds for any continuous function $f$ and one of its primitives $F$
also makes evident the relation between the mean value theorems of
integral calculus (p. 141) and of differential calculus (p. 173). By the
mean value theorem of integral calculus we conclude from (33) that

\[ F(b) - F(a) = (b - a)\, f(\xi). \]

Since $F$ is a primitive of $f$, we can replace $f(\xi)$ by $F'(\xi)$ and obtain the
mean value theorem of differential calculus for the function F. Of 
course, the requirement that F have a continuous derivative is stronger 
than the requirement of the mean value theorem of differential calculus, 
that the derivative merely exist. 


d. Examples 


In Chapter 3 we shall make extensive use of the fundamental theorem 
in evaluating integrals. For the moment we illustrate the method that is 
based on the use of the formula 


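\[ \int_a^b \frac{dF(x)}{dx}\,dx = F(b) - F(a) \]

by some examples.

On p. 163 we derived the formula

\[ \frac{d}{dx}\, x^n = n x^{n-1} \]

for positive integers $n$. This formula is really a trivial consequence of the
binomial theorem since

\[
\begin{aligned}
\frac{d}{dx}\, x^n &= \lim_{h \to 0} \frac{1}{h}\,[(x + h)^n - x^n] \\
&= \lim_{h \to 0} \frac{1}{h}\left(x^n + n h x^{n-1} + \binom{n}{2} h^2 x^{n-2} + \cdots + h^n - x^n\right) \\
&= \lim_{h \to 0} \left(n x^{n-1} + \binom{n}{2} h x^{n-2} + \cdots + h^{n-1}\right) = n x^{n-1}.
\end{aligned}
\]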
Integrating between the limits $a$ and $b$ we find that

\[ \int_a^b n x^{n-1}\,dx = b^n - a^n. \]

Writing $m$ for $n - 1$ we obtain the formula

\[ \int_a^b x^m\,dx = \frac{1}{m + 1}\left(b^{m+1} - a^{m+1}\right) \]

for integers $m \ge 0$. This derivation of the expression for the integral
of $x^m$ is much simpler than the one given on p. 131 which was based on a
geometric subdivision of the interval $[a, b]$; moreover, the result is
now actually more general since we can dispense with the assumption
that $a$ and $b$ are positive.
The formulas

\[ \frac{d \sin x}{dx} = \cos x, \qquad \frac{d \cos x}{dx} = -\sin x \]

were obtained on p. 165 by applying the addition theorems for
trigonometric functions and using $\lim_{h \to 0} (\sin h)/h = 1$. Integrating we
immediately obtain

\[ \int_a^b \cos x\,dx = \sin b - \sin a, \qquad \int_a^b \sin x\,dx = \cos a - \cos b. \]

Again this derivation of the integration formulas from the fundamental 
theorem is simpler than the one based on the definition of the definite 
integral as limit of a sum. 
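The agreement between the two routes, limit of sums and difference of a primitive, can be seen in a short computation. This is a sketch under assumptions made here: the sums are midpoint sums over equal cells, the interval $[-1, 2]$ is chosen so that the lower limit is negative, and the helper names are illustrative.

```python
import math

def midpoint_sum(f, a, b, n=2000):
    """Approximating sum for the integral of f over [a, b] with n equal cells."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

def by_primitive(F, a, b):
    """Definite integral from the primitive: F(b) - F(a)."""
    return F(b) - F(a)

a, b, m = -1.0, 2.0, 3   # a < 0 is allowed for integer m >= 0

power_sum = midpoint_sum(lambda x: x ** m, a, b)
power_exact = by_primitive(lambda x: x ** (m + 1) / (m + 1), a, b)

cos_sum = midpoint_sum(math.cos, a, b)
cos_exact = by_primitive(math.sin, a, b)

sin_sum = midpoint_sum(math.sin, a, b)
sin_exact = by_primitive(lambda x: -math.cos(x), a, b)

print(power_sum, power_exact)
print(cos_sum, cos_exact)
print(sin_sum, sin_exact)
```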


Supplement. The Existence of the Definite Integral 
of a Continuous Function 


We have yet to prove the fact that the integral of a function $f(x)$
between the limits $a$ and $b$ ($a < b$) exists whenever $f(x)$ is continuous in
the closed interval $[a, b]$. The proof will be based mainly on the uniform
continuity of $f(x)$ (see p. 41): for any given positive $\varepsilon$ the values of $f$
at any two points $\xi$ and $\eta$ of the interval differ by less than $\varepsilon$ if $\xi$ and $\eta$
are sufficiently close to each other, the degree of closeness dependent
solely upon $\varepsilon$ and independent of $\xi$, $\eta$; in other words, there exists a
uniform modulus of continuity $\delta(\varepsilon)$ such that $|f(\xi) - f(\eta)| < \varepsilon$ for
any values $\xi$, $\eta$ in $[a, b]$ for which $|\xi - \eta| < \delta$.
The definition of integral as a limit of sums requires that we subdivide
the interval $[a, b]$ into $n$ parts by successive points $x_0, x_1, \ldots, x_n$, where
$x_0 = a$, $x_n = b$ and $x_0 < x_1 < \cdots < x_n$. Let $S_n$ be a name for a
particular subdivision of $[a, b]$ of this type into $n$ cells. The coarseness
of the subdivision will be measured by the length of the largest of the
resulting cells, that is, the largest of the quantities $\Delta x_i = x_i - x_{i-1}$,
which we shall call the "span" of $S_n$. Because of the uniform continuity
of $f$ the values of $f$ in any two points of the same cell differ by less than
$\varepsilon$ as soon as the span of $S_n$ is less than $\delta = \delta(\varepsilon)$. An approximating
sum based on the subdivision $S_n$ is obtained by choosing a value $\xi_i$ in
each cell $[x_{i-1}, x_i]$ and forming

\[ F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i. \]


We have to prove that for a sequence of subdivisions $S_n$ with span
tending to zero the sums $F_n$ converge toward a limit, which we shall
denote by $\int_a^b f(x)\,dx$, and that the value of this limit does not depend on
the particular choice of subdivisions and of intermediate points $\xi_i$.
To carry out the proof we first compare the values $F_n$ and $F_N$ belonging
to two subdivisions $S_n$ and $S_N$ where the span of $S_n$ is less than $\delta$ and
where the subdivision $S_N$ is a "refinement" of $S_n$; that is, all points of
subdivision of $S_n$ occur among those of $S_N$. We have in appropriately
modified notation

\[ F_N = \sum_{j=1}^{N} f(\eta_j)\,\Delta y_j, \]

where the values $y_j$ are the points of subdivision of $S_N$, where
$\Delta y_j = y_j - y_{j-1}$, and $\eta_j$ lies in the interval $[y_{j-1}, y_j]$. Two successive
subdivision points $x_{i-1}$ and $x_i$ of $S_n$ also occur among the values $y_j$, say
$x_{i-1} = y_{r-1}$, $x_i = y_s$. In $S_N$ the cell $[x_{i-1}, x_i]$ is broken up into intervals,
say $[y_{r-1}, y_r], [y_r, y_{r+1}], \ldots, [y_{s-1}, y_s]$, making the total contribution

\[ \sum_{j=r}^{s} f(\eta_j)(y_j - y_{j-1}) \]

to $F_N$. We compare this to the contribution of the cell $[x_{i-1}, x_i]$ to $F_n$
given by $f(\xi_i)(x_i - x_{i-1})$, which can be written as

\[ \sum_{j=r}^{s} f(\xi_i)(y_j - y_{j-1}) \]

(see Fig. 2.33) and find for the absolute value of the difference of the
contributions

\[ \left| \sum_{j=r}^{s} [f(\eta_j) - f(\xi_i)](y_j - y_{j-1}) \right| < \sum_{j=r}^{s} \varepsilon\,(y_j - y_{j-1}) = \varepsilon\,(x_i - x_{i-1}). \]
Hence, adding up the differences of the contributions to $F_n$ and $F_N$ for
all cells $[x_{i-1}, x_i]$ of $S_n$, we find the estimate

\[ |F_N - F_n| < \sum_{i=1}^{n} \varepsilon\,(x_i - x_{i-1}) = \varepsilon\,(b - a), \]

whenever $S_n$ has span less than $\delta(\varepsilon)$ and $S_N$ is a refinement of $S_n$.

Figure 2.33

If now $S_n$ and $S_m$ are any two subdivisions, we can consider the
subdivision $S_N$ formed by all the points of subdivision $S_n$ together with
all those of $S_m$. Then $S_N$ will be a refinement of both $S_n$ and $S_m$.
Assume that both $S_n$ and $S_m$ have span less than $\delta(\varepsilon)$. Choosing any
intermediate points $\eta_j$ of the cells of $S_N$ to define $F_N$, we find

\[ |F_n - F_m| = |(F_n - F_N) - (F_m - F_N)| \le |F_n - F_N| + |F_m - F_N| < 2\varepsilon\,(b - a). \]


We see then that any two approximating sums differ arbitrarily
little from each other, if the spans of the corresponding subdivisions
are sufficiently small. Consider now any sequence of subdivisions $S_n$
whose spans tend to zero for $n \to \infty$. Let $F_n$ be the corresponding
approximating sums. For any $\varepsilon > 0$ the span of $S_n$ is less than $\delta(\varepsilon)$
for all sufficiently large $n$. Hence

\[ |F_n - F_m| < 2\varepsilon\,(b - a) \]

for both $n$ and $m$ sufficiently large. It follows that the sequence $F_n$
satisfies the Cauchy convergence criterion (see p. 97); consequently,

\[ \lim_{n \to \infty} F_n = F \]

exists.
It remains to show that the value of $\lim_{n \to \infty} F_n$ does not depend on the
particular subdivisions and intermediate points. If $S_n'$ denotes
any other sequence of subdivisions with spans tending to zero, then the
corresponding sum $F_n'$ has a limit $F'$. Since

\[ |F_n - F_n'| < 2\varepsilon\,(b - a) \]

as soon as the spans of $S_n$ and $S_n'$ are less than $\delta(\varepsilon)$, we find for $n \to \infty$
that also $|F - F'| \le 2\varepsilon\,(b - a)$. Since here $\varepsilon$ is an arbitrary positive
number, it follows that $F = F'$. Hence the limit $F$, which we denote
by $\int_a^b f(x)\,dx$, is uniquely determined.


The proof of the existence of the definite integral of a continuous 
function is thus complete. 
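The independence of the limit from the choice of subdivision and of intermediate points can be watched in a small experiment. The sketch below is an illustration under assumptions made here: the subdivisions and the points in each cell are drawn at random with a fixed seed, and the integrand is $\cos x$ on $[0, 1]$, for which the difference of function values in a cell is at most the span, so the error of every sum is at most the span times $b - a$.

```python
import math
import random

def random_riemann_sum(f, a, b, n, rng):
    """Approximating sum for a random subdivision of [a, b] into n cells,
    with a random intermediate point in each cell. Returns (sum, span)."""
    points = sorted([a, b] + [rng.uniform(a, b) for _ in range(n - 1)])
    total, span = 0.0, 0.0
    for left, right in zip(points, points[1:]):
        xi = rng.uniform(left, right)        # arbitrary point of the cell
        total += f(xi) * (right - left)
        span = max(span, right - left)       # length of the largest cell
    return total, span

rng = random.Random(0)                       # fixed seed for repeatability
exact = math.sin(1.0)                        # integral of cos over [0, 1]
results = []
for n in (10, 100, 1000, 10000):
    total, span = random_riemann_sum(math.cos, 0.0, 1.0, n, rng)
    results.append((n, total, span))
    print(f"n = {n}: span = {span:.4f}, |sum - integral| = {abs(total - exact):.2e}")
```

However irregular the cells are, the deviation from the integral is controlled by the span alone, which is the content of the estimate $|F_n - F_m| < 2\varepsilon(b - a)$.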


More General Approximating Sums. Our proof indicates more
clearly what is essential in the approximation of an integral by a sum.
It makes evident the fact that a somewhat more general limiting process
could be formulated, leading also to the integral, and that the following
more general form of the theorem is true: $f_i$ need not be a function
value in order that the sums $F_n = \sum f_i\,\Delta x_i$ converge to the integral; it
suffices instead that $|f_i - f(\xi_i)| < \delta(\varepsilon)$ for some point $\xi_i$ in the interval
$[x_{i-1}, x_i]$, where $\delta(\varepsilon) \to 0$ for $\varepsilon \to 0$.

This general statement is often useful. If, for example,
$f(x) = \phi(x)\psi(x)$, then instead of the sum $\sum f(\xi_\nu)\,\Delta x_\nu$ we may consider the more
general sum

\[ \sum \phi(\xi_\nu')\,\psi(\xi_\nu'')\,\Delta x_\nu, \]

where $\xi_\nu'$ and $\xi_\nu''$ are two not necessarily coincident points of the cell.
This sum also tends to the integral

\[ \int_a^b f(x)\,dx = \int_a^b \phi(x)\psi(x)\,dx \]

as $n$ increases, provided that the length of the longest cell tends to zero.
A corresponding statement holds for other sums formed in an
analogous way; for example, the sum

\[ \sum_{\nu=1}^{n} \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2}\;\Delta x_\nu \]

tends to the integral

\[ \int_a^b \sqrt{\phi(x)^2 + \psi(x)^2}\,dx. \]

To prove these statements we only have to show that the change $D$ in the
approximating sums due to the deviation of $\xi_\nu''$ from $\xi_\nu'$ tends to zero in the
limit. This is obvious in the first example where the change in the
approximating sum is

\[ D = \sum_{\nu=1}^{n} \phi(\xi_\nu')\,[\psi(\xi_\nu'') - \psi(\xi_\nu')]\,\Delta x_\nu. \]

Since $\phi$ is bounded and $\psi$ uniformly continuous, $D$ can be made arbitrarily
small by choosing sufficiently small cells.

The change in the second sum is represented by

\[ D = \sum_{\nu=1}^{n} \left(\sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2} - \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu')^2}\right)\Delta x_\nu. \]

Using the triangle inequality applied to the triangle with vertices $(a, 0)$, $(0, b)$,
$(0, c)$ in the form $|\sqrt{a^2 + b^2} - \sqrt{a^2 + c^2}| \le |b - c|$, we find that

\[ \left|\sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2} - \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu')^2}\right| \le |\psi(\xi_\nu'') - \psi(\xi_\nu')|, \]

from which follows immediately that $D$ tends to zero.
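Both examples can be tried out with sums in which the two factors are evaluated at different points of each cell. In this sketch, which rests on choices made here for illustration, the two points are the left and right end points of equal cells; the first sum uses $\phi(x) = x$, $\psi(x) = e^x$ on $[0, 1]$ (integral equal to 1), the second uses $\phi = \cos$, $\psi = \sin$ (integrand identically 1). The helper name `two_point_sum` is hypothetical.

```python
import math

def two_point_sum(combine, phi, psi, a, b, n):
    """Sum of combine(phi(left end), psi(right end)) * width over n equal cells.

    phi and psi are deliberately evaluated at two different points of each cell.
    """
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        left, right = a + i * width, a + (i + 1) * width
        total += combine(phi(left), psi(right)) * width
    return total

# First example: products phi(xi') * psi(xi''); the integral of x * e^x over [0, 1] is 1.
product = two_point_sum(lambda p, q: p * q, lambda x: x, math.exp, 0.0, 1.0, 1000)

# Second example: sqrt(phi(xi')^2 + psi(xi'')^2); with cos and sin the integral is 1.
length = two_point_sum(math.hypot, math.cos, math.sin, 0.0, 1.0, 1000)

print(product, length)
```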


PROBLEMS 


SECTION 2.1, Page 120 


1. Let $f$ be a positive monotone function defined on $[a, b]$, where $0 < a < b$.
Let $\phi$ be the inverse of $f$ and set $\alpha = f(a)$, $\beta = f(b)$. Using the interpretation
of integral as area show that

\[ \int_\alpha^\beta \phi(y)\,dy = b\beta - a\alpha - \int_a^b f(x)\,dx. \]


SECTION 2.2, Page 128 
1. Prove for any natural number $p$ that

\[ \int_a^b x^p\,dx = \frac{1}{p + 1}\left(b^{p+1} - a^{p+1}\right) \]

using a subdivision of $[a, b]$ into cells of equal length. Employ the techniques
in Chapter 1, miscellaneous Problems 5 to 12, to evaluate the approximating
sums $F_n$.


2. Derive the formula for $\int_a^b x^\alpha\,dx$, $a, b > 0$, when $\alpha$ is rational and
negative, say $\alpha = -r/s$, where $r$ and $s$ are natural numbers. (Hint: Set
$q^{1/s} = \tau$, where $q = \sqrt[n]{b/a}$.)
3. By the method used to find the integral of sin x, derive the formula

\[ \int_a^b \cos x\,dx = \sin b - \sin a. \]

4. Make a general statement about $\int_{-a}^{a} f(x)\,dx$ when $f(x)$ is (a) an odd
function and (b) an even function.
5. Calculate $\int_0^{\pi/2} \sin x\,dx$ and $\int_0^{\pi/2} \cos x\,dx$. Explain on geometrical grounds
why these should be the same. Furthermore, explain why

\[ \int_a^{a + 2\pi} \sin x\,dx = \int_b^{b + 2\pi} \cos x\,dx \]

for all values of $a$ and $b$.
6. (a) Evaluate $I_n = \int_0^a x^{1/n}\,dx$. What is $\lim_{n \to \infty} I_n$? Interpret geometrically.

(b) Do the same for $I_n = \int_0^a x^n\,dx$.


7. Evaluate 


SECTION 2.3, Page 136 


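*1. Cauchy's inequality for integrals. Prove that for all continuous functions
$f(x)$, $g(x)$

\[ \int_a^b [f(x)]^2\,dx \int_a^b [g(x)]^2\,dx \ge \left(\int_a^b f(x) g(x)\,dx\right)^2. \]

*2. Prove that if $f(x)$ is continuous and

\[ f(x) = \int_0^x f(t)\,dt, \]

then $f(x)$ is identically zero.

*3. Let $f(x)$ be Lipschitz-continuous on $[0, 1]$; that is,

\[ |f(x) - f(y)| \le M\,|x - y| \]

for all $x$, $y$ in the interval. Prove that

\[ \left| \int_0^1 f(x)\,dx - \frac{1}{n} \sum_{k=1}^{n} f\!\left(\frac{k}{n}\right) \right| \le \frac{M}{2n}. \]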
SECTION 2.5, Page 145 
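1. Prove

\[ \log \frac{p}{q} \le \frac{p - q}{\sqrt{pq}} \qquad (q \le p). \]

(Hint: Apply Cauchy's inequality, Problem 1.)

2. (a) Verify that $\log(1 + x) = \int_0^x \frac{1}{1 + u}\,du$, where $x > -1$.

(b) Show for $x > 0$ that

\[ x - \frac{x^2}{2} < \log(1 + x) < x. \]

*(c) More generally, show for $0 < x < 1$ that

\[ x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots - \frac{x^{2n}}{2n} < \log(1 + x) < x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{x^{2n+1}}{2n + 1}. \]

(Hint: Compare 1/(1 + u) with a geometric progression.)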
SECTION 2.6, Page 149
1. (a) Prove

\[ \int_a^b e^x\,dx = e^b - e^a \]

using a subdivision of $[a, b]$ into equal cells. [Hint: Apply
$\log a = \lim_{n \to \infty} n(\sqrt[n]{a} - 1)$.]

(b) Find $\int_a^b \log x\,dx$. (See Section 2.1, Problem 1.)

(c) Show for $x > 0$ that

\[ 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} < e^x < 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{e^x x^{n+1}}{(n + 1)!}. \]

(Hint: Obtain upper and lower estimates for $\int_0^x e^u\,du$ and integrate
repeatedly.)

Obtain estimates of the same type for $e^x$ when $x < 0$.


SECTION 2.8c, Page 163 
Calculate the derivatives of the following functions wherever defined
directly as the limits of their difference quotients.
1. $\tan x$.
2. $\sec^2 x$.
3. $\sin \sqrt{x}$.
4. $\sqrt{\sin x}$.
5. $1/\sin x$.
6. $\sin (1/x)$.
7. $x^\alpha$, where $\alpha$ is rational and negative.
SECTION 2.8i, Page 175 
1. Show $x > \sin x$ for positive $x$ and $x < \tan x$ for $x$ in $(0, \pi/2)$.

2. If $f(x)$ is continuous and differentiable for $a \le x \le b$, show that if
$f'(x) \le 0$ for $a \le x < \xi$ and $f'(x) \ge 0$ for $\xi < x \le b$, the function is never
less than $f(\xi)$.

*3. If the continuous function $f(x)$ has a derivative $f'(x)$ at each point $x$ in
the neighborhood of $x = \xi$, and if $f'(x)$ approaches a limit $L$ as $x \to \xi$, then
$f'(\xi)$ exists and is equal to $L$.

*4. Let $f(x)$ be defined and differentiable on the entire x-axis. Show that
if $f(0) = 0$ and everywhere $|f'(x)| \le |f(x)|$, then $f(x) = 0$ identically.


SECTION 2.9, Page 184 


*1. If a particle traverses distance 1 in time 1, beginning and ending at
rest, then at some point in the interval it must have been subjected to an 
acceleration equal to 4 or more. 
SUPPLEMENT, Page 192. Existence of the Definite Integral
1. Let $f(x)$ be defined and bounded on $[a, b]$. We define the upper sum $\Sigma$
and lower sum $\sigma$ for the subdivision

\[ a = x_0 < x_1 < x_2 < \cdots < x_n = b \]

to be

\[ \Sigma = \sum_{i=1}^{n} M_i\,\Delta x_i, \qquad \sigma = \sum_{i=1}^{n} m_i\,\Delta x_i, \]

where $M_i$ is the least upper bound, $m_i$ the greatest lower bound of $f(x)$ in
the cell $[x_{i-1}, x_i]$.

(a) Show that in any refinement of a subdivision the upper sum either 
decreases or remains unchanged and, similarly, the lower sum increases or 
remains unchanged. 

(b) Prove that each upper sum is greater than or equal to every lower sum. 

(c) The upper Darboux integral $F^+$ is defined as the greatest lower bound of
the upper sums and the lower Darboux integral $F^-$ as the least upper bound
of the lower sums over all subdivisions. From (b), $F^+ \ge F^-$. If $F^+ = F^-$ we
call the common value the Darboux integral of $f$. Prove that the Darboux
integral of $f$ is actually the ordinary Riemann integral; furthermore, show
that the Riemann integral exists if and only if the upper and lower Darboux
integrals exist and are equal.


2. Let $f(x)$ be a monotone function defined on $[a, b]$.

(a) Show that the difference between the upper and lower sums for a
subdivision into $n$ equal cells is given exactly by

\[ \Sigma - \sigma = |f(b) - f(a)|\,(b - a)/n, \]

and explain this result geometrically.

(b) Use the result of (a) to prove that the Darboux integral exists.

(c) Estimate $\Sigma - \sigma$ in terms of $f(a)$, $f(b)$ and the span of the subdivision
if the cells of the subdivision may be unequal.

(d) Mostly $f(x)$, if not monotone, can be written as the sum of monotone
functions, $f(x) = \phi(x) + \psi(x)$ where $\phi$ is nonincreasing and $\psi$ is nondecreasing.
Estimate the difference between the upper and lower sums in that case.

3. Show that if f(x) has a continuous derivative in the closed interval 
[a, b], then f(x) can be written as the sum of monotone functions as in Problem 


MISCELLANEOUS PROBLEMS 


1. Prove that

(a) $\displaystyle \int_{-1}^{1} (x^2 - 1)^2\,dx = \frac{16}{15}$,

(b) $\displaystyle (-1)^n \int_{-1}^{1} (x^2 - 1)^n\,dx = \frac{2^{2n+1}\,(n!)^2}{(2n + 1)!}$.

2. Prove for the binomial coefficient $\binom{n}{k}$ that

\[ \binom{n}{k} = \left[(n + 1) \int_0^1 x^k (1 - x)^{n-k}\,dx\right]^{-1}. \]

*3. If $f(x)$ possesses a derivative $f'(x)$ (not necessarily continuous) at each
point $x$ of $a \le x \le b$, and if $f'(x)$ assumes the values $m$ and $M$ it also assumes
every value $\mu$ between $m$ and $M$.
4. If $f''(x) \ge 0$ for all values of $x$ in $a \le x \le b$, the graph of $y = f(x)$ lies
on or above the tangent line at any point $x = \xi$, $y = f(\xi)$ of the graph.

5. If $f''(x) \ge 0$ for all values of $x$ in $a \le x \le b$, the graph of $y = f(x)$ in
the interval $x_1 \le x \le x_2$ lies below the line segment joining the two points of
the graph for which $x = x_1$, $x = x_2$.

6. If $f''(x) \ge 0$, then

\[ f\!\left(\frac{x_1 + x_2}{2}\right) \le \frac{f(x_1) + f(x_2)}{2}. \]

*7. Let $f(x)$ be a function such that $f''(x) \ge 0$ for all values of $x$ and let
$u = u(t)$ be an arbitrary continuous function. Then

\[ \frac{1}{a} \int_0^a f(u(t))\,dt \ge f\!\left(\frac{1}{a} \int_0^a u(t)\,dt\right). \]

8. (a) Differentiate directly and write down the corresponding integration
formulas: (i) $x^{1/2}$; (ii) $\tan x$.

(b) Evaluate

\[ \lim_{n \to \infty} \frac{1}{n}\left(\sec^2 \frac{\pi}{4n} + \sec^2 \frac{2\pi}{4n} + \cdots + \sec^2 \frac{n\pi}{4n}\right). \]

9. Let $f(x)$ have first and second derivatives for all real values of $x$. Prove
that if $f(x)$ is everywhere positive and concave, then $f(x)$ is constant.

3 


The Techniques of Calculus 


Part A Differentiation and Integration of the Elementary 
Functions 


3.1 The Simplest Rules for Differentiation and Their Applications 


Although problems of integration are usually of greater importance 
than those of differentiation, the latter offer less formal difficulty than 
the former. Therefore it is a natural procedure first to master the art 
of differentiating the widest possible classes of functions; then by the 
fundamental theorem (Section 2.9) the results of differentiation are 
available for evaluating integrals. In the following sections we shall 
pursue such applications of the fundamental theorem. To a certain 
extent we shall make a fresh start and develop techniques of integration 
systematically on the basis of certain general rules for differentiation. 


a. Rules for Differentiation 
We assume that in the interval under consideration the functions 


f(x) and g(x) are differentiable; then the following rules are basic. 


Rule 1. Multiplication by a Constant. For any constant c, the
function $\phi(x) = cf(x)$ is differentiable, and

$$\phi'(x) = cf'(x). \tag{1}$$

The obvious proof was given in Chapter 2, p. 165.

Rule 2. Derivative of a Sum. If $\phi(x) = f(x) + g(x)$, then $\phi(x)$ is
differentiable and

$$\phi'(x) = f'(x) + g'(x); \tag{2}$$

that is, the operations of differentiation and addition are interchangeable.
The same holds for the sum of a finite number n of differentiable
functions

$$\phi(x) = \sum_{\nu=1}^{n} f_\nu(x),$$

for which we obtain

$$\phi'(x) = \sum_{\nu=1}^{n} f_\nu'(x).$$


The proof is obvious from the definition of derivative. 


Rule 3. Derivative of a Product. If $\phi(x) = f(x)g(x)$, then $\phi(x)$ is
differentiable and

$$\phi'(x) = f(x)g'(x) + g(x)f'(x). \tag{3}$$

The proof follows from the equation

$$\frac{\phi(x+h) - \phi(x)}{h} = \frac{f(x+h)g(x+h) - f(x)g(x)}{h} = f(x+h)\,\frac{g(x+h) - g(x)}{h} + g(x)\,\frac{f(x+h) - f(x)}{h}.$$

Taking the limit in this expression as $h \to 0$ yields Eq. (3).
This formula becomes more elegant if we divide¹ by $\phi(x) = f(x)g(x)$.
We then obtain

$$\frac{\phi'(x)}{\phi(x)} = \frac{f'(x)}{f(x)} + \frac{g'(x)}{g(x)}.$$

Using the notation of differentials (Chapter 2, p. 179) we may also
rewrite Eq. (3) as

$$d(fg) = f\,dg + g\,df.$$


By induction we obtain for the derivative of a product of n factors
an expression consisting of n terms, each of which consists of the
derivative of one factor multiplied by all the other factors of the
original product:

$$\begin{aligned}
\phi'(x) &= \frac{d}{dx}\,[f_1(x) f_2(x) \cdots f_n(x)] \\
&= f_1'(x) f_2(x) \cdots f_n(x) + f_1(x) f_2'(x) \cdots f_n(x) + \cdots + f_1(x) f_2(x) \cdots f_n'(x) \\
&= \sum_{\nu=1}^{n} \frac{\phi(x)}{f_\nu(x)}\, f_\nu'(x),
\end{aligned}$$

or on division by $\phi(x) = f_1(x) f_2(x) \cdots f_n(x)$,

$$\frac{\phi'(x)}{\phi(x)} = \frac{f_1'(x)}{f_1(x)} + \frac{f_2'(x)}{f_2(x)} + \cdots + \frac{f_n'(x)}{f_n(x)} = \sum_{\nu=1}^{n} \frac{f_\nu'(x)}{f_\nu(x)},$$

which is valid where $\phi(x) \ne 0$.

¹ We must, of course, assume that $\phi(x)$ is nowhere equal to zero.
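By repeated application of the rule for the derivative of a product we
can obtain formulas for the second and higher derivatives as well. We
have for the second derivative

$$\begin{aligned}
\frac{d^2(fg)}{dx^2} &= \frac{d}{dx}\left(\frac{d(fg)}{dx}\right) = \frac{d}{dx}\left(f\frac{dg}{dx} + g\frac{df}{dx}\right) \\
&= \frac{d}{dx}\left(f\frac{dg}{dx}\right) + \frac{d}{dx}\left(g\frac{df}{dx}\right) \\
&= f\frac{d^2g}{dx^2} + 2\,\frac{df}{dx}\frac{dg}{dx} + g\frac{d^2f}{dx^2}.
\end{aligned}$$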


Leibnitz's Rule. The reader should prove by induction that the nth
derivative of a product may be found according to the following rule:

$$\frac{d^n(fg)}{dx^n} = f\frac{d^ng}{dx^n} + \binom{n}{1}\frac{df}{dx}\frac{d^{n-1}g}{dx^{n-1}} + \binom{n}{2}\frac{d^2f}{dx^2}\frac{d^{n-2}g}{dx^{n-2}} + \cdots + \binom{n}{n-1}\frac{d^{n-1}f}{dx^{n-1}}\frac{dg}{dx} + \frac{d^nf}{dx^n}\,g.$$

Here $\binom{n}{1} = n$, $\binom{n}{2} = [n(n-1)]/2!$, etc., denote the binomial
coefficients.
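
Leibnitz's rule can be checked on concrete polynomials with exact integer arithmetic. The following is a minimal sketch, not part of the original text: polynomials are represented as coefficient lists `[a0, a1, ...]`, and the helper names (`poly_mul`, `poly_diff`, `poly_add`, `trim`, `leibniz`) as well as the sample polynomials are assumptions made purely for illustration.

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p, times=1):
    """Differentiate a coefficient list `times` times (constant -> [0])."""
    for _ in range(times):
        p = [k * a for k, a in enumerate(p)][1:] or [0]
    return p

def poly_add(p, q):
    """Add two coefficient lists of possibly different length."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def leibniz(f, g, n):
    """n-th derivative of f*g as the sum of C(n,k) f^(k) g^(n-k)."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(f, k), poly_diff(g, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3 (arbitrary sample)
g = [0, -1, 4]     # -x + 4x^2     (arbitrary sample)
direct = trim(poly_diff(poly_mul(f, g), 3))  # differentiate the product directly
print(leibniz(f, g, 3) == direct)
```

Because all arithmetic is on integers, the comparison is exact rather than approximate; this is of course only a spot check on particular polynomials, not a substitute for the induction proof.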


Rule 4. Derivative of a Quotient. For a quotient

$$\phi(x) = \frac{f(x)}{g(x)}$$

the following rule holds: The function $\phi(x)$ is differentiable at every
point at which g(x) does not vanish, and

$$\phi'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{[g(x)]^2}. \tag{4}$$

If $\phi(x) \ne 0$, this can be written as

$$\frac{\phi'(x)}{\phi(x)} = \frac{f'(x)}{f(x)} - \frac{g'(x)}{g(x)}.$$

PROOF. If we assume the differentiability of $\phi(x)$, we can apply the
product rule to $f(x) = \phi(x)g(x)$ and conclude

$$f'(x) = \phi(x)g'(x) + g(x)\phi'(x).$$

By substituting $f(x)/g(x)$ for $\phi(x)$ on the right and solving for $\phi'(x)$,
we obtain Rule 4.


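We can prove the differentiability of $\phi(x)$ as well as the rule if we
write

$$\frac{\phi(x+h) - \phi(x)}{h} = \frac{\dfrac{f(x+h)}{g(x+h)} - \dfrac{f(x)}{g(x)}}{h} = \frac{g(x)\,\dfrac{f(x+h) - f(x)}{h} - f(x)\,\dfrac{g(x+h) - g(x)}{h}}{g(x)g(x+h)}.$$
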
If we now let h tend to zero we arrive at the result stated; for by 
hypothesis the denominator does not tend to zero but to the limit 
[g(x)]?, and the two terms of the numerator have limits g(x) f'(x) and 
g'(x) f(x), respectively. This proves both the existence of the limit 
on the left side and the differentiation formula. 


b. Differentiation of the Rational Functions 
First, we derive once more the formula

$$\frac{d}{dx}\,x^n = nx^{n-1}$$

for every positive integer n, invoking the rule for differentiating a product.
We think of $x^n$ as a product of n factors, $x^n = x \cdots x$, and thus
obtain

$$\frac{d}{dx}\,x^n = 1 \cdot x^{n-1} + 1 \cdot x^{n-1} + \cdots + 1 \cdot x^{n-1} = nx^{n-1}.$$

The second derivative of the function $x^n$ follows from this formula and
Eq. (1):

$$\frac{d^2}{dx^2}\,x^n = n(n-1)x^{n-2}.$$

Continuing, we obtain the higher derivatives

$$\frac{d^3}{dx^3}\,x^n = n(n-1)(n-2)x^{n-3}, \quad \ldots, \quad \frac{d^n}{dx^n}\,x^n = n(n-1)\cdots 2 \cdot 1 = n!.$$

From the last of these formulas it is clear that the nth derivative of
$x^n$ is a constant, whereas the (n + 1)th derivative vanishes everywhere.

By using our first two rules and the rule for differentiating powers we
can differentiate any polynomial $y = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$,
obtaining

$$y' = a_1 + 2a_2x + 3a_3x^2 + \cdots + na_nx^{n-1};$$

furthermore,

$$y'' = 2a_2 + 3 \cdot 2\,a_3x + 4 \cdot 3\,a_4x^2 + \cdots + n(n-1)a_nx^{n-2},$$


and so on. 

The derivative of any rational function can now be found with the
help of the quotient rule. In particular, we again deduce the differentiation
formula for the function $x^n$, where n = −m is a negative integer.
Application of the quotient rule, together with the fact that the derivative
of a constant is equal to zero, gives the result

$$\frac{d}{dx}\left(\frac{1}{x^m}\right) = -\frac{mx^{m-1}}{x^{2m}} = -\frac{m}{x^{m+1}},$$

or, if we take m = −n,

$$\frac{d}{dx}\,x^n = nx^{n-1},$$

which agrees formally with the result for positive values of n and with
the results given earlier (p. 164).


c. Differentiation of the Trigonometric Functions 


For the trigonometric functions sin x and cos x we have already
obtained (p. 165) the differentiation formulas

$$\frac{d}{dx}\sin x = \cos x \quad\text{and}\quad \frac{d}{dx}\cos x = -\sin x.$$

The quotient rule now enables us to differentiate the functions

$$y = \tan x = \frac{\sin x}{\cos x} \quad\text{and}\quad y = \cot x = \frac{\cos x}{\sin x}.$$

According to the rule, the derivative of the first of these functions is

$$y' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x},$$

so that

$$\frac{d}{dx}\tan x = \frac{1}{\cos^2 x} = \sec^2 x = 1 + \tan^2 x.$$

Similarly, we obtain

$$\frac{d}{dx}\cot x = -\frac{1}{\sin^2 x} = -\operatorname{cosec}^2 x = -(1 + \cot^2 x).$$
To the differentiation formulas for sin x, cos x, tan x, and cot x
correspond the following integration formulas:

$$\int \cos x \, dx = \sin x, \qquad \int \sin x \, dx = -\cos x,$$

$$\int \frac{1}{\cos^2 x}\, dx = \tan x, \qquad \int \frac{1}{\sin^2 x}\, dx = -\cot x.$$


From these formulas we obtain by way of the fundamental rule of
Section 2.9, p. 190 the value of the definite integral between any
limits, the only restriction being that when the last two formulas are
used, the interval of integration must not contain any point of discontinuity
of the integrand such as an odd multiple of $\pi/2$ in the first
case, and an even multiple of $\pi/2$ in the second. For example,

$$\int_a^b \cos x \, dx = \sin x \,\Big|_a^b = \sin b - \sin a.$$


3.2 The Derivative of the Inverse Function 
a. General Formula 


We have seen on p. 45 that a continuous function y = f(x) has a 
continuous inverse in every interval in which it is monotonic. Precisely: 


If $a \le x \le b$ is an interval in which the continuous function y = f(x)
is monotonic, and if $f(a) = \alpha$ and $f(b) = \beta$, then f has an inverse function
which in the interval between $\alpha$ and $\beta$ is continuous and monotonic.


As pointed out on p. 177, the sign of the derivative provides a simple
test for seeing when a function is monotonic and therefore has an
inverse. A differentiable function is continuous, and is monotonic
increasing in an interval where f'(x) is greater than zero, and monotonic
decreasing in an interval for which f'(x) is everywhere less than zero.

We shall now characterize the derivative of the inverse function by 
proving the following theorem. 


THEOREM. If the function y = f(x) is differentiable in the interval
a < x < b, and either f'(x) > 0 or f'(x) < 0 throughout the interval,
then the inverse function $x = \phi(y)$ also possesses a derivative at every
interior point of its interval of definition: the derivatives of y = f(x)
and of its inverse $x = \phi(y)$ satisfy the relation $f'(x) \cdot \phi'(y) = 1$ at
corresponding values x, y.


This relation can also be put in the form

$$\frac{dy}{dx} = \frac{1}{dx/dy}. \tag{5}$$

This last formula again illustrates the suitability of Leibnitz's notation:
the symbolic quotient dy/dx can be treated in formulas as if it were
an actual fraction.

PROOF. The proof of the theorem is simple. Writing the derivative
as the limit of a difference quotient, we have

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{x_1 \to x} \frac{y_1 - y}{x_1 - x},$$

where x and y = f(x), and $x_1$ and $y_1 = f(x_1)$, respectively denote pairs
of corresponding values. By hypothesis the first of these limiting values
is not equal to zero. Because of the continuity of y = f(x) and $x = \phi(y)$,
the relations $y_1 \to y$ and $x_1 \to x$ are equivalent. Therefore the limiting
value

$$\lim_{x_1 \to x} \frac{x_1 - x}{y_1 - y} = \lim_{y_1 \to y} \frac{x_1 - x}{y_1 - y}$$

exists and is equal to $1/f'(x)$. On the other hand, the limiting value on
the right-hand side is by definition the derivative $\phi'(y)$ of the inverse
function $\phi(y)$, and our formula is proved.
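
The relation between the two derivatives can also be observed numerically. The sketch below is an illustration added here, not part of the original text: the sample function f(x) = x³ + x (whose derivative is positive everywhere) and the bisection helper `inverse` are assumptions chosen for convenience.

```python
def inverse(f, y, lo, hi, tol=1e-12):
    """Invert an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f(x):
    return x**3 + x          # f'(x) = 3x^2 + 1 > 0, so f is increasing

def f_prime(x):
    return 3 * x**2 + 1

x0 = 1.0
y0 = f(x0)                   # corresponding value y = f(x)
h = 1e-5
# Central difference quotient of the inverse function at y0.
phi_prime = (inverse(f, y0 + h, 0, 2) - inverse(f, y0 - h, 0, 2)) / (2 * h)
print(phi_prime, 1 / f_prime(x0))   # the two numbers should nearly agree
```

The agreement is limited by the bisection tolerance and the step `h`, so the two printed values match only to several decimal places.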


The simple geometrical meaning of the formula is clearly shown in
Fig. 3.1. The tangent to the curve y = f(x) or $x = \phi(y)$ makes an
angle $\alpha$ with the positive x-axis, and an angle $\beta$ with the positive y-axis;
from the geometrical interpretation of the derivative of a function as
the slope of the tangent

$$f'(x) = \tan\alpha, \qquad \phi'(y) = \tan\beta.$$

Since the sum of the angles $\alpha$ and $\beta$ is $\pi/2$, $\tan\alpha \tan\beta = 1$, and this
relationship is exactly equivalent to our differentiation formula.

Figure 3.1 Differentiation of the inverse function.


Critical Points 


We have hitherto expressly assumed that either f'(x) > 0 or
f'(x) < 0, that is, that f'(x) is never zero. What, then, happens if
f'(x) = 0? If f'(x) = 0 everywhere in an interval, then f is constant
there, and consequently has no inverse because the same value of y
corresponds to all values of x in the interval. If f'(x) = 0 only at
isolated "critical" points (and if f'(x) is assumed continuous), then we
have two cases, according to whether on passing through these points
f'(x) changes sign, or not. In the first case this point separates an interval
where the function is monotonic increasing from another where it is
monotonic decreasing. In the neighborhood of such a point there
can be no single-valued inverse function. In the second case the
vanishing of the derivative does not contradict the monotonic character
of the function y = f(x), so that a single-valued inverse exists. However,
the inverse function is no longer differentiable at the corresponding
point; in fact, its derivative is infinite there. The functions
$y = x^2$ and $y = x^3$ at the point x = 0 offer examples of the two types.
Figure 3.2 and Fig. 3.3 illustrate the behavior of the two functions
upon passing through the origin and at the same time show that the
function $y = x^3$ has a single-valued inverse, whereas the other function
$y = x^2$ does not.

Figure 3.2 Parabola.

b. The Inverse of the nth Power: the nth Root


The simplest example is the inverse of the function $y = x^n$ for
positive integers n; at first we assume positive values of x, hence also
y > 0. Under these conditions y' is always positive, so that for all
positive values of y we can form the unique inverse function

$$x = \sqrt[n]{y} = y^{1/n}.$$

The derivative of this inverse function is immediately obtained by the
above general rule as follows:

$$\frac{d(y^{1/n})}{dy} = \frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{nx^{n-1}} = \frac{1}{n}\,\frac{1}{y^{(n-1)/n}} = \frac{1}{n}\,y^{(1/n)-1}.$$

If we now change the notation and denote again the independent
variable by x, we may finally write

$$\frac{d\sqrt[n]{x}}{dx} = \frac{d}{dx}\left(x^{1/n}\right) = \frac{1}{n}\,x^{(1/n)-1},$$

which agrees with the result obtained on p. 164.

For n > 1, the point x = 0 requires special consideration. If x
approaches zero through positive values, $d(x^{1/n})/dx$ will obviously
increase beyond all bounds; this corresponds to the fact that for
n > 1 the derivative of the nth power $f(x) = x^n$ vanishes at the origin.
Geometrically, this means that the curves $y = x^{1/n}$ for n > 1 touch the
y-axis at the origin (cf. Fig. 1.35, p. 48).

It should be noted that for odd values of n the assumption x > 0
can be omitted and the function $y = x^n$ is monotonic and has an
inverse over the entire domain of real numbers. The formula

$$\frac{d(\sqrt[n]{y})}{dy} = \frac{1}{n}\,y^{(1/n)-1}$$

still holds for negative values of y, but for x = 0, n > 1, we have
$d(x^n)/dx = 0$, which corresponds to an infinite derivative dx/dy of the
inverse function at the point y = 0.


c. The Inverse Trigonometric Functions—Multivaluedness 


To form the inverses of the trigonometric functions we once again
consider the graphs¹ of sin x, cos x, tan x, and cot x. We see at once
from Figs. 1.37, p. 50 and 1.38, p. 51, that for each of these functions it
is necessary to select a definite interval if we are to speak of a unique
inverse; for the lines y = c parallel to the x-axis cut the curves in an
infinite number of points, if at all.

¹ The graphical representation will help the reader to overcome the slight difficulties
inherent in the discussion of the "multivaluedness" of the inverse functions.


The Inverse Sine and Cosine 


For the function y = sin x, for example (Fig. 3.4), the derivative
y' = cos x is positive in the interval $-\pi/2 < x < \pi/2$. In this interval
y = sin x has an inverse function which we denote by¹

x = arc sin y

(read arc sine y; this means the angle whose sine has the value y).
This function increases monotonically from $-\pi/2$ to $+\pi/2$ as y
traverses the interval −1 to +1. If we wish to emphasize that we are
considering the inverse function of the sine in this particular interval,
we speak of the principal value of the arc sine. For some other interval
in which sin x is monotonic, for example, the interval $+\pi/2 < x < 3\pi/2$,
we obtain another inverse or "branch" of the arc sine; without the
exact statement of the interval in which the values of the inverse
function should lie, the symbol arc sine means not one well-defined
function but, in fact, denotes an infinite number of values.²

The multivaluedness of arc sin y is described by the statement: To
any one value y of the sine there corresponds not only a specific angle x
but also any angle of the form $2k\pi + x$ or $(2k + 1)\pi - x$, where k is
any integer (cf. Fig. 3.4).

Figure 3.4 Graph of y = sin x (principal value indicated by solid curve).


¹ The symbolic notation $x = \sin^{-1} y$ is also used where there is no danger of confusion
with the reciprocal function 1/sin x.
² Sometimes loosely called a multiple-valued function.

Figure 3.5 Graph of y = arc sin x (principal value indicated by solid curve).


The derivative of the function x = arc sin y is obtained from Eq. (5)
as follows:

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - \sin^2 x}} = \frac{1}{\sqrt{1 - y^2}},$$

where the square root is to be taken as positive if we confine ourselves
to the first interval mentioned, that is, $-\pi/2 < x < \pi/2$.¹

Finally, we change the name of the independent variable from y to
the commonly used x (Fig. 3.5); then the derivative of arc sin x is
expressed by

$$\frac{d}{dx}\operatorname{arc\,sin} x = \frac{1}{\sqrt{1 - x^2}}.$$

Here it is assumed that arc sine is the principal value which lies between
$-\pi/2$ and $+\pi/2$, and the square root sign is chosen positive.

¹ If instead of this we had chosen the interval $\pi/2 < x < 3\pi/2$, corresponding to the
substitution of $x + \pi$ for x, we should have had to use the negative square root
since cos x is negative in this interval.


Figure 3.6 Graph of y = arc cos x (principal value indicated by solid curve). 


For the inverse function of y = cos x, denoted (after again interchanging
the names x and y) by arc cos x, we obtain the formula

$$\frac{d}{dx}\operatorname{arc\,cos} x = -\frac{1}{\sqrt{1 - x^2}}$$

in exactly the same way. Here we take the negative sign of the root if the
value of arc cos x is taken in the interval between 0 and $\pi$ (not, as in the
case of arc sin x, between $-\pi/2$ and $+\pi/2$) (cf. Fig. 3.6).

The derivatives become infinite on approaching the end points
x = −1 and x = +1, corresponding to the fact that the graphs of the
inverse sine and inverse cosine have vertical tangents at these points.


Inverse Tangent and Cotangent 


We treat the inverse functions of the tangent and cotangent in an
analogous way. The function y = tan x, having an everywhere
positive derivative $1/\cos^2 x$ for $x \ne \pi/2 + k\pi$, has a unique inverse
in the interval $-\pi/2 < x < \pi/2$. We call this inverse function (the
Principal Branch of) x = arc tan y. We see at once from Fig. 3.7 that
for each x we could have chosen instead of y any of the values $y + k\pi$
(where k is an integer). Similarly, the function y = cot x has an inverse
x = arc cot y which is uniquely determined if we require that its value
shall lie in the interval from 0 to $\pi$; otherwise the many-valuedness of
arc cot x is the same as for arc tan x.

Figure 3.7 Graph of y = arc tan x (solid curve for principal value).

The differentiation formulas are as follows:

$$x = \operatorname{arc\,tan} y, \qquad \frac{dx}{dy} = \frac{1}{dy/dx} = \cos^2 x = \frac{1}{1 + \tan^2 x} = \frac{1}{1 + y^2};$$

$$x = \operatorname{arc\,cot} y, \qquad \frac{dx}{dy} = \frac{1}{dy/dx} = -\sin^2 x = -\frac{1}{1 + \cot^2 x} = -\frac{1}{1 + y^2};$$

or finally, if we denote again the independent variable by x,

$$\frac{d}{dx}\operatorname{arc\,tan} x = \frac{1}{1 + x^2}, \qquad \frac{d}{dx}\operatorname{arc\,cot} x = -\frac{1}{1 + x^2}.$$


d. The Corresponding Integral Formulas 


Expressed in terms of indefinite integrals, the formulas which we
have just derived are written as follows:

$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \operatorname{arc\,sin} x, \qquad \int \frac{1}{\sqrt{1 - x^2}}\,dx = -\operatorname{arc\,cos} x,$$

$$\int \frac{1}{1 + x^2}\,dx = \operatorname{arc\,tan} x, \qquad \int \frac{1}{1 + x^2}\,dx = -\operatorname{arc\,cot} x.$$

Although the two formulas on each line express different functions by
identical indefinite integrals, they do not contradict each other. In
fact, they illustrate what we learned earlier (see Section 2.9), that all
indefinite integrals of the same function differ only by constants; here
the constants are $\pi/2$ since arc cos x + arc sin x = $\pi/2$, arc tan x +
arc cot x = $\pi/2$.

The formulas for indefinite integrals may immediately be put to use
for finding definite integrals, as on p. 143. In particular,

$$\int_a^b \frac{dx}{1 + x^2} = \operatorname{arc\,tan} x \,\Big|_a^b = \operatorname{arc\,tan} b - \operatorname{arc\,tan} a.$$


If we put a = 0, b = 1 and recall that tan 0 = 0 and $\tan \pi/4 = 1$, we
obtain the remarkable formula

$$\frac{\pi}{4} = \int_0^1 \frac{1}{1 + x^2}\,dx. \tag{6}$$

The number $\pi$, which originally arose from the consideration of the
circle, is brought by this formula into a very simple relationship with
the rational function $1/(1 + x^2)$, and represents the area indicated in
Fig. 3.8. This formula for $\pi$, to which we shall return later (p. 445),
constitutes one of the early triumphs of the power of calculus. 
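
Formula (6) lends itself to a direct numerical experiment. The following sketch is an illustration added here, not a method from the text: it approximates the integral with the midpoint rule (the helper name `midpoint_rule` and the choice of 1000 subintervals are assumptions) and multiplies by 4.

```python
import math

def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# pi/4 is the area under 1/(1 + x^2) between x = 0 and x = 1.
approx = 4 * midpoint_rule(lambda x: 1 / (1 + x * x), 0.0, 1.0, 1000)
print(approx, math.pi)
```

Increasing `n` brings the approximation closer to π; the integrand is a rational function, so no trigonometric evaluation is needed to obtain the estimate.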

More generally, the integral formulas of this section permit us to
define the trigonometric functions purely analytically, without any
reference to geometric objects such as triangles or circles. For example,
the relation between an angle y and its tangent x = tan y is completely
described by the equation

$$y = \int_0^x \frac{du}{1 + u^2}$$

(at least for $-\pi/2 < y < \pi/2$). With this relation we may now define
without appeal to intuition a numerical value for the angle y in a right
triangle with sides a (adjacent) and b (opposite) for which b/a = x.


Figure 3.8 $\pi/4$ illustrated by an area.


Such an analytic definition in terms of numerical quantities makes the 
use of angles and trigonometric functions legitimate in higher analysis 
irrespective of a definition by geometrical construction. 


e. Derivative and Integral of the Exponential Function 


In Chapter 2, p. 150, we introduced the exponential function as the
inverse of the logarithm. Precisely speaking, the relations $y = e^x$ and
$x = \log y$ were thus defined to be equivalent. Consequently their
derivatives satisfy the relation [see (5) p. 207]

$$\frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{1/y} = y.$$

Hence the exponential function is equal to its derivative:

$$\frac{d}{dx}\,e^x = e^x.$$

More generally, for any positive a the function $y = a^x$ has as its
inverse

$$x = \log_a y = \frac{\log y}{\log a},$$

and the derivative of $a^x$ is

$$\frac{da^x}{dx} = \frac{1}{dx/dy} = (\log a)\,y = (\log a)\,a^x.$$

Thus for any positive constant a the derivative of the function $y = a^x$
is proportional to the function itself. The factor of proportionality
log a is 1 when a is the number e. On p. 223 we shall show conversely
that any function which is proportional to its derivative must be of the
form $y = ce^{\alpha x}$, where c denotes a constant factor and $\alpha$ is the factor
of proportionality.

By the fundamental theorem of calculus we can again translate the
formulas for the derivatives of $e^x$ and of $a^x$ into formulas for indefinite
integrals:

$$\int e^x \, dx = e^x, \qquad \int a^x \, dx = \frac{a^x}{\log a} \quad (a \ne 1).$$


3.3 Differentiation of Composite Functions 


a. Definitions 


The preceding rules allow us to find the derivatives of functions that 
are obtained as rational expressions in terms of functions with already 
known derivatives. To find explicit expressions for the derivatives of 
other functions occurring in analysis we must go further by deriving a 
general rule for the differentiation of composite or compound functions. 
We are confronted quite often with functions f(x) built by the process
of composition of simpler ones (see Chapter 1, p. 52): $f(x) = g(\phi(x))$,
where $\phi(x)$ is defined in a closed interval $a \le x \le b$ and has there the
range $\alpha \le \phi \le \beta$, and where $g(\phi)$ is defined in this latter interval.

In this connection it is useful to remember the interpretation of
functions as "operators" or mappings. As in Chapter 1, we write
the composite function simply as

$$f = g\phi$$

and call $g\phi$ the (symbolic) "product" of the operators or mappings
g and $\phi$.

b. The Chain Rule


For functions g and $\phi$ which are continuous in their respective
intervals of definition the compound function $f(x) = g[\phi(x)]$ is continuous
as well (see Chapter 1, p. 55).

The functions $\phi(x)$ and $g(\phi)$ are now assumed to be not only continuous
but differentiable. We then have the following fundamental
theorem, the chain rule of differentiation:


The function $f(x) = g[\phi(x)]$ is differentiable, and its derivative is given
by the equation

$$f'(x) = g'(\phi) \cdot \phi'(x), \tag{7}$$

or, in Leibnitz's notation,

$$\frac{df}{dx} = \frac{dg}{d\phi}\,\frac{d\phi}{dx}.$$


Therefore the derivative of a compound function is the product of the 
derivatives of its constituent functions. Or: The derivative of the symbolic 
product of functions is the actual product of their derivatives with respect 
to their corresponding independent variables. 


Intuitively, this chain rule is very plausible. The quantity $\phi'(x) =
\lim \Delta\phi/\Delta x$ is the local ratio in which small intervals are magnified by
the mapping $\phi$. Similarly, $g'(\phi)$ is the magnification given by the
mapping g. Applying first $\phi$ and then g results in magnifying an
x-interval first $\phi'$-fold, and then enlarging the resulting $\phi$-interval $g'$-fold,
resulting in a total magnification ratio of $g'\phi'$ which must be the
magnification ratio for the composite mapping $f = g\phi$.

The theorem follows very easily from the definition of the derivative.
In fact, it becomes intuitively almost obvious if we assume $\phi'(x) \ne 0$
in the closed x-interval under consideration. Then for $\Delta x = x_2 - x_1 \ne 0$
we have by the mean value theorem

$$\Delta\phi = \phi_2 - \phi_1 = \phi(x_2) - \phi(x_1) = \phi'(\xi)\,\Delta x \ne 0 \quad\text{with } x_1 < \xi < x_2$$

and, with $\Delta g = g(\phi_2) - g(\phi_1)$ and $\Delta f = f(x_2) - f(x_1)$, we may write

$$\frac{\Delta f}{\Delta x} = \frac{\Delta g}{\Delta\phi}\,\frac{\Delta\phi}{\Delta x},$$

which is a meaningful identity because $\Delta\phi \ne 0$. Now $\Delta\phi \to 0$ for
$\Delta x \to 0$, that is, for $x_2 \to x_1$; therefore for $\Delta x \to 0$ the difference
quotients tend to the respective derivatives and the theorem is proved.

To avoid the explicit assumption $\phi'(x) \ne 0$ we can dispense with the
division by $\Delta\phi$ in the following slightly more subtle manner:

From the assumption of differentiability of $g(\phi)$ at the point $\phi$ we
know that the quantity $\varepsilon = \Delta g/\Delta\phi - g'(\phi)$ as a function of $\Delta\phi$ for
fixed $\phi$ and $\Delta\phi \ne 0$ has the limit zero for $\Delta\phi \to 0$. If we define $\varepsilon = 0$
for $\Delta\phi = 0$, we have without restriction on $\Delta\phi$

$$\Delta g = [g'(\phi) + \varepsilon]\,\Delta\phi.$$

Similarly for fixed x,

$$\Delta\phi = \phi(x + \Delta x) - \phi(x) = [\phi'(x) + \eta]\,\Delta x,$$

where $\lim_{\Delta x \to 0} \eta = 0$. Then for $\Delta x \ne 0$ and $\phi = \phi(x)$,

$$\frac{\Delta f}{\Delta x} = \frac{\Delta g}{\Delta x} = [g'(\phi) + \varepsilon]\,[\phi'(x) + \eta].$$

For $\Delta x$ tending to zero through nonzero values we have $\lim \Delta\phi = 0$ and
hence $\lim \varepsilon = 0$, so that

$$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0}\,[g'(\phi) + \varepsilon] \cdot \lim_{\Delta x \to 0}\,[\phi'(x) + \eta] = g'(\phi)\,\phi'(x),$$


which proves the chain rule. 
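
A quick numerical experiment makes the chain rule tangible, including a point where the inner derivative vanishes and the first form of the proof would not apply. This is a sketch added for illustration only: the inner function φ(x) = x² + 1, the outer function g = sin, and the helper `central_diff` are assumptions, not taken from the text.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def phi(x):
    return x * x + 1           # inner function, phi'(x) = 2x

def f(x):
    return math.sin(phi(x))    # composite function f(x) = g[phi(x)] with g = sin

def chain_rule(x):
    return math.cos(phi(x)) * 2 * x   # g'(phi) * phi'(x)

for x in (0.7, -1.3, 0.0):     # at x = 0 the inner derivative phi'(x) is zero
    print(x, central_diff(f, x), chain_rule(x))
```

At x = 0 both columns are (numerically) zero, which is consistent with the second form of the proof: the rule holds even where φ'(x) = 0.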

By successive application of our rule we immediately extend it to
functions arising from the composition of more than two functions.
If, for example,

$$y = g(u), \qquad u = \phi(v), \qquad v = \psi(x),$$

then $y = f(x) = g[\phi(\psi(x))]$ is a compound function of x; its derivative
is given by the rule

$$\frac{dy}{dx} = y' = g'(u)\,\phi'(v)\,\psi'(x) = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx};$$

similar relations are true for functions that are compounded of an
arbitrary number of functions.


Higher Derivatives of a Composite Function. The higher derivatives of
$y = g[\phi(x)]$ can be found easily by repeated application of the chain
rule and the preceding rules:

$$\begin{aligned}
y' &= g'\phi', \\
y'' &= g''\phi'^2 + g'\phi'', \\
y''' &= g'''\phi'^3 + 3g''\phi'\phi'' + g'\phi'''.
\end{aligned}$$

Analogous formulas for $y''''$ etc., can be derived successively.



Finally, let us examine the composition of two functions inverse to each
other. The function g(y) is the inverse of $y = \phi(x)$ if $f(x) = g[\phi(x)] = x$.
It follows that

$$f'(x) = g'(y)\,\phi'(x) = 1,$$

which is exactly the result of Section 3.2, p. 207.


Examples. As a simple but important example of an application of
the chain rule we differentiate $x^\alpha$ (x > 0) for an arbitrary real power $\alpha$.
In Chapter 2, p. 152, we defined

$$x^\alpha = e^{\alpha \log x};$$

we also proved for $\phi(x) = \log x$, $\psi(u) = \alpha u$, $g(y) = e^y$ that

$$\phi'(x) = \frac{1}{x}, \qquad \psi'(u) = \alpha, \qquad g'(y) = e^y.$$

Now $x^\alpha$ is the compound function $g\{\psi[\phi(x)]\}$. Applying the chain rule
we obtain the general formula

$$\frac{d}{dx}(x^\alpha) = g'(y)\,\psi'(u)\,\phi'(x) = e^y \cdot \alpha \cdot \frac{1}{x} \quad\text{with } y = \alpha \log x;$$

hence

$$\frac{d}{dx}(x^\alpha) = \alpha x^{\alpha - 1},$$

a result we could prove only with some difficulty had we attempted
to proceed directly from the definition of $x^\alpha$ for irrational $\alpha$ as the limit
of powers with rational exponents.
An immediate consequence of this differentiation is, again, the
integral formula

$$\int x^\alpha \, dx = \frac{x^{\alpha + 1}}{\alpha + 1} \qquad (\alpha \ne -1).$$


As a second example, we consider

$$y = \sqrt{1 - x^2} \quad\text{or}\quad y = \sqrt{\phi},$$

where $\phi = 1 - x^2$ and −1 < x < 1. The chain rule yields

$$\frac{dy}{dx} = \frac{1}{2\sqrt{\phi}} \cdot (-2x) = -\frac{x}{\sqrt{1 - x^2}}.$$

Further examples are given by the following brief calculations. 


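1. $y = \operatorname{arc\,sin} \sqrt{1 - x^2}$, (−1 < x < 1, x ≠ 0).

$$\frac{dy}{dx} = \frac{1}{\sqrt{1 - (1 - x^2)}}\,\frac{d\sqrt{1 - x^2}}{dx} = \frac{1}{|x|} \cdot \frac{-x}{\sqrt{1 - x^2}} = -\frac{x}{|x|}\,\frac{1}{\sqrt{1 - x^2}}.$$


2. $y = \sqrt{\dfrac{1 + x}{1 - x}}$, (−1 < x < 1).

$$\frac{dy}{dx} = \frac{1}{2\sqrt{\dfrac{1 + x}{1 - x}}}\,\frac{d}{dx}\left(\frac{1 + x}{1 - x}\right) = \frac{\sqrt{1 - x}}{2\sqrt{1 + x}} \cdot \frac{2}{(1 - x)^2} = \frac{1}{(1 + x)^{1/2}(1 - x)^{3/2}}.$$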


3. $y = \log |x|$. This function¹ can be expressed as $\log x$ for x > 0
and as $\log(-x)$ for x < 0. For x > 0

$$\frac{d \log |x|}{dx} = \frac{d \log x}{dx} = \frac{1}{x}.$$

For x < 0 we obtain from the chain rule that

$$\frac{d \log |x|}{dx} = \frac{d \log(-x)}{dx} = \frac{1}{-x}\,\frac{d(-x)}{dx} = \frac{1}{x}.$$

Hence generally for x ≠ 0

$$\frac{d \log |x|}{dx} = \frac{1}{x}.$$

¹ The function log x is defined only for x > 0, whereas log |x| is defined everywhere
except for x = 0.


4. $y = a^x$. By definition of $a^x$ (see p. 152) we have

$$a^x = e^{\phi(x)},$$

where $\phi(x) = (\log a)x$. Then

$$\frac{da^x}{dx} = \frac{de^\phi}{d\phi}\,\frac{d\phi}{dx} = e^\phi \log a = (\log a)\,a^x.$$

The same result was obtained already on p. 217 from the rule for
the derivative of the inverse function.


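5. $y = [f(x)]^{g(x)}$. Since

$$[f(x)]^{g(x)} = e^{\phi(x)}$$

with $\phi(x) = g(x) \log [f(x)]$, we find

$$\frac{d}{dx}\,[f(x)]^{g(x)} = e^{\phi(x)}\phi'(x) = [f(x)]^{g(x)}\left(g'(x) \log f(x) + g(x)\,\frac{f'(x)}{f(x)}\right).$$

For example, when g(x) = f(x) = x we have

$$\frac{d}{dx}\,x^x = x^x(\log x + 1).$$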
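
The same pattern applies to any positive base raised to a variable exponent. As a further worked case, added here for illustration and not taken from the original text, take f(x) = x and g(x) = sin x with x > 0:

```latex
\begin{align*}
y &= x^{\sin x} = e^{\phi(x)}, \qquad \phi(x) = \sin x \,\log x \quad (x > 0),\\
\phi'(x) &= \cos x \,\log x + \frac{\sin x}{x},\\
\frac{dy}{dx} &= e^{\phi(x)}\,\phi'(x)
  = x^{\sin x}\left(\cos x \,\log x + \frac{\sin x}{x}\right).
\end{align*}
```

The product rule supplies the derivative of the exponent, and the chain rule then supplies the derivative of the exponential of that exponent.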


c. The Generalized Mean Value Theorem 
of the Differential Calculus 


As an application of the chain rule we derive the generalized mean value
theorem of differential calculus. Consider two functions F(x) and G(x),
continuous on a closed interval [a, b] of the x-axis, and differentiable on the
interior of that interval. We assume that G'(x) is positive. The ordinary
mean value theorem of differential calculus applied separately to F and G
furnishes an expression for the difference quotient

$$\frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(\xi)(b - a)}{G'(\eta)(b - a)} = \frac{F'(\xi)}{G'(\eta)},$$

where $\xi$ and $\eta$ are suitable intermediate values in the open interval (a, b).
The generalized mean value theorem states that we can write the difference
quotient in the simpler form

$$\frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(\zeta)}{G'(\zeta)},$$

where F' and G' are evaluated at the same intermediate value $\zeta$.

For the proof we introduce u = G(x) as an independent variable in F.
From the assumption G' > 0 we conclude that the function u = G(x) is
monotonic in the interval [a, b], and hence that it has an inverse x = g(u)
defined in the interval $[\alpha, \beta]$, where $\alpha = G(a)$, $\beta = G(b)$. The compound
function F[g(u)] = f(u) is therefore defined for u in the interval $[\alpha, \beta]$. From
the ordinary mean value theorem we find that

$$F(b) - F(a) = f(\beta) - f(\alpha) = f'(\gamma)(\beta - \alpha) = f'(\gamma)[G(b) - G(a)],$$

where $\gamma$ is a suitable value between $\alpha$ and $\beta$. By the chain rule, we infer

$$f'(u) = \frac{d}{du}F[g(u)] = F'[g(u)]\,g'(u) = \frac{F'(x)}{G'(x)}.$$

To the value $u = \gamma$ there corresponds a value $x = g(\gamma) = \zeta$ in the interval
(a, b). Then $f'(\gamma) = F'(\zeta)/G'(\zeta)$, and the generalized mean value theorem
follows.
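
The difference between the two statements, one common intermediate value ζ as opposed to two separate values ξ and η, shows up clearly in a concrete case. The sketch below is an illustration added here under assumed sample data: F(x) = x³ and G(x) = x² on [1, 2], with a simple bisection helper `bisect` (an illustrative name) to locate the intermediate values.

```python
def bisect(h, lo, hi, tol=1e-12):
    """Find a root of h in [lo, hi], assuming h changes sign on the interval."""
    sign_lo = h(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (h(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

F = lambda x: x**3
G = lambda x: x**2
dF = lambda x: 3 * x**2
dG = lambda x: 2 * x            # positive on [1, 2], as the theorem requires

a, b = 1.0, 2.0
quotient = (F(b) - F(a)) / (G(b) - G(a))

# Generalized theorem: a single zeta with F'(zeta)/G'(zeta) = quotient.
zeta = bisect(lambda x: dF(x) / dG(x) - quotient, a, b)
# Ordinary theorem applied separately: xi for F, eta for G.
xi = bisect(lambda x: dF(x) - (F(b) - F(a)) / (b - a), a, b)
eta = bisect(lambda x: dG(x) - (G(b) - G(a)) / (b - a), a, b)
print(zeta, xi, eta)
```

For this pair of functions the three values differ, which is why the generalized theorem is a genuinely stronger statement than two separate applications of the ordinary one.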


3.4 Some Applications of the Exponential Function 


Some miscellaneous problems involving the exponential function will 
illustrate the fundamental importance of this function in all sorts of 
applications. 


a. Definition of the Exponential Function 
by Means of a Differential Equation 


We can define the exponential function by a simple property, whose 
use obviates many detailed arguments in particular cases. 


If a function y = f(x) satisfies an equation of the form

$$y' = \alpha y, \tag{8}$$

where $\alpha$ is a constant, then y has the form

$$y = f(x) = ce^{\alpha x},$$

where c is also a constant; conversely, every function of the form $ce^{\alpha x}$
satisfies the equation $y' = \alpha y$.


Since Eq. (8) expresses a relation between the function and its de- 
rivative, it is called the differential equation of the exponential function. 

It is clear that $y = ce^{\alpha x}$ satisfies this equation for any arbitrary
constant c. Conversely, no other function satisfies the differential
equation $y' - \alpha y = 0$. For if y is such a function, we consider the
function $u = ye^{-\alpha x}$. We then have

$$u' = y'e^{-\alpha x} - \alpha y e^{-\alpha x} = e^{-\alpha x}(y' - \alpha y).$$

However, the right-hand side vanishes, since we have assumed that
$y' = \alpha y$; hence u' = 0, so that by p. 178 u is a constant c and
$y = ce^{\alpha x}$ as we wished to prove.

We shall now apply this theorem to a number of examples.
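
The theorem can be illustrated by integrating the differential equation numerically and comparing with the closed form. The sketch below is an addition for illustration only: it uses the classical fourth-order Runge-Kutta scheme (a standard numerical method, not one discussed in the text), and the values of α, c, the end point, and the step count are arbitrary assumptions.

```python
import math

def rk4(rhs, y0, x_end, steps):
    """Integrate y' = rhs(x, y) from x = 0 to x_end with classical Runge-Kutta."""
    h = x_end / steps
    x, y = 0.0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h * k1 / 2)
        k3 = rhs(x + h / 2, y + h * k2 / 2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

alpha, c = -0.5, 3.0                                  # arbitrary sample constants
numeric = rk4(lambda x, y: alpha * y, c, 2.0, 200)    # solve y' = alpha * y, y(0) = c
exact = c * math.exp(alpha * 2.0)                     # closed form c * e^(alpha x) at x = 2
print(numeric, exact)
```

The numerical solution started from y(0) = c tracks $ce^{\alpha x}$ closely, in line with the statement that no other function satisfies the equation with that initial value.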


b. Interest Compounded Continuously.
Radioactive Disintegration


A capital sum, or principal, augmented by its interest at regular 
periods of time, increases by jumps at these interest periods in the 
following manner. If 100« is the percent of interest, and further- 
more, if the interest accrued is added to the principal at the end of each 
year, after x years the accumulated amount of an original principal of 1 
will be 

(1 + a)”. 


If, however, the principal had the interest added to it not at the end of 
each year, but at the end of each nth part of a year, after x years the 
principal would amount to 

(1 + =) 
n 


Taking x = 1 for the sake of simplicity, we find that the principal 1 has 
increased after one year to 
(1 + 2) | 
n 


If we now let n increase beyond all bounds, that is, if we let the interest 
be credited at shorter and shorter intervals, the limiting case will mean 
in a sense that the compound interest is credited at each instant; then 
the total amount after one year will be e^a times the original principal
(see p. 153). Similarly, if the interest is calculated in this manner, an
original principal of 1 will have grown after x years to an amount
e^{ax}; here x may be any number, integral or otherwise.
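
The passage to the limit can be watched numerically. The sketch below is an illustration under assumed values (a rate of 5 percent and the helper name `compound` are our own choices); it tabulates the amount after one year as the interest is credited more and more often.

```python
import math

def compound(a, n, x=1.0):
    """Amount after x years when interest at rate a is credited n times a year."""
    return (1.0 + a / n) ** (n * x)

a = 0.05                                   # illustrative rate: 5 percent
for n in (1, 12, 365, 1_000_000):
    print(n, compound(a, n))
print("limit e^a =", math.exp(a))
```

The printed amounts increase with n and approach e^a from below, in agreement with the limit quoted from p. 153.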

The discussion in Section 3.4a forms a framework into which examples 
of this type are readily fitted. We consider a quantity, given by the 
number y, which increases (or decreases) with time so that the rate at 
which this quantity increases or decreases is proportional to the total 
quantity. Then with time as the independent variable x, we obtain a 
law of the form y' = ay for the rate of increase, where a, the factor of
proportionality, is positive or negative depending on whether the
quantity is increasing or decreasing. Then in accordance with Section
3.4a the quantity y itself is represented by a formula

    y = ce^{ax},

where the meaning of the constant c is immediately obvious if we
consider the instant x = 0. At that instant e^{ax} = 1, and we find that
c = y_0 is the quantity at the beginning of the time considered, so that
we may write

    y = y_0 e^{ax}.

A characteristic example is that of radioactive disintegration. The 
rate at which the total quantity y of the radioactive substance is 
diminishing is proportional at any instant to the total quantity present 
at that instant; this is a priori plausible, for each portion of the sub- 
stance decreases as rapidly as every other portion. Therefore the 
quantity y of the substance expressed as a function of time satisfies a 
relation of the form y’ = —ky, where k is to be taken as positive since 
we are dealing with a diminishing quantity. The quantity of substance 
is thus expressed as a function of the time by y = y_0 e^{-kx}, where y_0
is the amount of the substance at the beginning of the time considered
(time x = 0).

After a certain time τ the radioactive substance will have diminished
to half its original quantity. This so-called half-life is given by the
equation

    \frac{1}{2} y_0 = y_0 e^{-kτ},

from which we immediately obtain τ = (log 2)/k.
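
As a small illustration of the half-life formula, the following sketch evaluates τ = (log 2)/k and checks that the quantity is halved after each period τ. The decay constant k = 0.25 is a hypothetical value, and the function names are our own.

```python
import math

def half_life(k):
    """Half-life tau = (log 2)/k for the decay law y = y0 * exp(-k x)."""
    return math.log(2.0) / k

def remaining(y0, k, x):
    """Quantity left after time x."""
    return y0 * math.exp(-k * x)

k = 0.25                        # hypothetical decay constant (per unit time)
tau = half_life(k)
print(tau, remaining(1.0, k, tau), remaining(8.0, k, 3 * tau))
```

Note that τ depends only on k and not on the initial amount y_0.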


c. Cooling or Heating of a Body by a Surrounding Medium 


Another typical example of the occurrence of the exponential function 
is the cooling of a body, for example, a metal plate of uniform temper- 
ature which is immersed in a very large bath of lower temperature. We 
assume that the surrounding bath is so large that its temperature is 
unaffected by the cooling process. We further assume that at each 
instant all parts of the immersed body are at the same temperature, 
and that the rate at which the temperature changes is proportional to 
the difference of the temperature of the body and that of the surrounding 
medium (Newton’s law of cooling). 

If we denote the time by x and the temperature difference between 
the body and the bath by y = y(x), this law of cooling is expressed by 
the equation 
    y' = -ky,
where k is a positive constant (whose value is a physical characteristic 
of the substance of the body). From this differential equation, which 
expresses the effect of the cooling process at a given instant, we obtain 
by means of Eq. (8), p. 223, an “integral law” giving us the temperature 
at any arbitrary time x in the form 


    y = ce^{-kx}.

This shows that the temperature decreases “exponentially” and tends
to become equal to the external temperature. The rapidity with which
this happens is expressed by the number k. As before, the meaning of
the constant c is that of the initial temperature at the instant x = 0,
y_0 = c, so that our law of cooling can be written in the form

    y = y_0 e^{-kx}.

Obviously, the same discussion applies also to the heating of a body.
The only difference is that the initial difference of temperature y_0 is
in this case negative instead of positive.
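
In practice the constant k is found from a measurement. The sketch below, with hypothetical numbers and function names of our own, determines k from one observed temperature difference and then computes the time needed to reach a prescribed difference.

```python
import math

def cooling_constant(y0, y1, x1):
    """k such that y0 * exp(-k * x1) = y1."""
    return math.log(y0 / y1) / x1

def time_to_reach(y0, k, y_target):
    """Time x at which the temperature difference has changed from y0 to y_target."""
    return math.log(y0 / y_target) / k

y0 = 80.0                               # hypothetical initial temperature difference
k = cooling_constant(y0, 40.0, 10.0)    # difference observed to halve in 10 time units
x = time_to_reach(y0, k, 10.0)
print(k, x)
```

Only the ratios y_0/y_1 and y_0/y enter, so the same functions apply to heating, where all temperature differences are negative.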


d. Variation of the Atmospheric Pressure with 
the Height above the Surface of the Earth 


A further example of the occurrence of the exponential formula is in 
the variation of atmospheric pressure with height: We make use of (1) 
the physical fact that the atmospheric pressure is equal to the weight 
of the column of air vertically above a surface of area one, and (2) of 
Boyle’s law, according to which the pressure p of the air at a given 
constant temperature is proportional to the density σ of the air. Boyle’s
law, expressed in symbols, is p = aσ, where a is a constant depending
on a specific physical property of the air. Our problem is to determine
p = f(h) as a function of the height h above the surface of the earth.

If by p_0 we denote the atmospheric pressure at the surface of the
earth, that is, the total weight of the air column supported by a unit
area, by g the gravitational constant, and by σ(λ) the density of the air
at the height λ above the earth, the weight¹ of the column up to the
height h is given by the integral g \int_0^h σ(λ) dλ. The pressure at height h
is therefore

    p = f(h) = p_0 - g \int_0^h σ(λ) dλ.

By differentiation this yields the following relation between the pressure
p = f(h) and the density σ(h):

    gσ(h) = -f'(h) = -p'.

We now use Boyle’s law to eliminate the quantity σ from this equation,
thus obtaining an equation p' = -(g/a)p which involves the unknown
pressure function only. From Eq. (8), p. 223, it follows that

    p = f(h) = ce^{-gh/a}.

If as above we denote the pressure f(0) at the earth's surface by p_0, it
follows immediately that c = p_0, and consequently

    p = f(h) = p_0 e^{-gh/a}.

Taking the logarithms yields

    h = \frac{a}{g} log \frac{p_0}{p}.

1 gσ(λ) is the weight of the air per unit volume at the height λ.

These two formulas are applied frequently. For example, if the constant 
a is known, they enable us to find the height of a place from the baro- 
metric pressure or to find the difference in height of two places by 
measuring the atmospheric pressure at each place. Again, if the atmospheric
pressure and the height h are known, we can determine the
constant a, which is of great importance in gas theory. 
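
The two formulas can be tried out in a few lines. The sketch below is illustrative only: the sea-level pressure, the sea-level density, and the value of g are assumed round figures in SI units, the constant a is obtained from them through Boyle's law p = aσ, and the function names are our own.

```python
import math

G = 9.81                  # assumed gravitational acceleration, m/s^2
P0 = 101325.0             # assumed pressure at the earth's surface, Pa
SIGMA0 = 1.225            # assumed air density at the surface, kg/m^3
A = P0 / SIGMA0           # constant a in Boyle's law p = a * sigma

def pressure(h):
    """p = p0 * exp(-g h / a) at height h (isothermal atmosphere)."""
    return P0 * math.exp(-G * h / A)

def height(p):
    """h = (a/g) * log(p0 / p), the height belonging to the pressure p."""
    return (A / G) * math.log(P0 / p)

print(pressure(1000.0), height(pressure(1000.0)))
```

The second function is simply the inverse of the first, which is what makes the barometer usable as an altimeter under the stated assumption of constant temperature.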


e. Progress of a Chemical Reaction 


We now consider an example from chemistry, the so-called uni- 
molecular reaction. We suppose that a substance is dissolved in a 
large amount of solvent, say a quantity of cane sugar in water. If a
chemical reaction occurs, the chemical law of mass action in this case 
states that the rate of reaction is proportional to the quantity of 
reacting substance present. We suppose that the cane sugar is being 
transformed by catalytic action into invert sugar, and we denote by 
u(x) the quantity of cane sugar which at time x is still unchanged. 
The velocity of reaction is then —du/dx, and in accordance with the 
law of mass action an equation of the form 


    \frac{du}{dx} = -ku


holds, where k is a constant depending on the substance reacting. 
From this instantaneous or differential law we immediately obtain, as 
on p. 223, an integral law, which gives us the amount of cane sugar as 
a function of the time: 

    u(x) = ae^{-kx}.


This formula shows us clearly how the chemical reaction tends asymp- 
totically to its final state u = 0, that is, complete transformation of the 
reacting substance. The constant a is obviously the quantity of cane 
sugar present at time x = 0.
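
Since log u = log a - kx is a linear function of x, the constants a and k can be estimated from observations by fitting a straight line to the values of log u. The following sketch shows one way to do this with an ordinary least-squares line; the data are synthetic, generated from a = 5 and k = 0.3 rather than measured, and the function name `fit_rate` is our own.

```python
import math

def fit_rate(xs, us):
    """Least-squares line through the points (x, log u); returns (a, k) for u = a*exp(-k x)."""
    n = len(xs)
    logs = [math.log(u) for u in us]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_l - slope * mean_x), -slope

# synthetic, noise-free samples (not measured data)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
us = [5.0 * math.exp(-0.3 * x) for x in xs]
a, k = fit_rate(xs, us)
print(a, k)
```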
f. Switching an Electric Circuit On or Off


As a final example we consider the growth of a direct electric current 
when a circuit is completed, or its decay when the circuit is broken. 
If R is the resistance of the circuit and E the electromotive force
(voltage), the current I gradually increases from its original value zero
to the steady final value E/R. We have therefore to consider I as a
function of the time x. The growth of the current depends on the self- 
induction of the circuit; the circuit has a characteristic constant L, 
the coefficient of self-induction, of such a nature that, as the current 
increases, an electromotive force of magnitude L dI/dx, opposed to the 
external electromotive force E, is developed. From Ohm’s law, as- 
serting that the product of the resistance and the current is at each 
instant equal to the actual effective voltage, we obtain the relation

    IR = E - L \frac{dI}{dx}.

For

    f(x) = I(x) - \frac{E}{R}

we immediately find f'(x) = -(R/L) f(x), so that by Eq. (8), p. 223,
f(x) = f(0)e^{-Rx/L}. Recalling I(0) = 0, we find f(0) = -E/R; thus
we obtain the expression

    I = f(x) + \frac{E}{R} = \frac{E}{R} (1 - e^{-Rx/L})

for the current as a function of the time. 
This expression shows how the current tends asymptotically to its 
steady value E/R when the circuit is closed. 
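
The result can be checked against the physical law it came from. By Ohm's law the product IR equals the effective voltage, that is, E diminished by the self-induced voltage L dI/dx. The sketch below evaluates the current formula for hypothetical circuit constants and tests that relation with a symmetric difference quotient; all numerical values and names are illustrative assumptions.

```python
import math

def current(x, E, R, L):
    """I(x) = (E/R) * (1 - exp(-R x / L)) after the circuit is closed at x = 0."""
    return (E / R) * (1.0 - math.exp(-R * x / L))

E, R, L = 12.0, 4.0, 2.0       # hypothetical volts, ohms, henries
h = 1e-6
x = 0.5
dI = (current(x + h, E, R, L) - current(x - h, E, R, L)) / (2 * h)
residual = current(x, E, R, L) * R - (E - L * dI)    # should be close to zero
print(current(0.0, E, R, L), current(10.0, E, R, L), residual)
```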


3.5 The Hyperbolic Functions


a. Analytical Definition


In many applications the exponential function enters in combinations
of the form (1/2)(e^x + e^{-x}) or (1/2)(e^x - e^{-x}).


It is convenient to introduce these and similar combinations as special 
functions; we denote them as follows: 


(9a)    sinh x = \frac{e^x - e^{-x}}{2},    cosh x = \frac{e^x + e^{-x}}{2},

(9b)    tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}},    coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}},

Figure 3.9


and we call them the hyperbolic sine, hyperbolic cosine, hyperbolic 
tangent, and hyperbolic cotangent respectively. The functions sinh x,
cosh x, and tanh x are defined for all values of x, whereas for coth x the
point x = 0 must be excluded. The names are chosen to express a certain
analogy with the trigonometric functions; it is this analogy, which we
are about to study in detail, that justifies special consideration of our
new functions. In Figs. 3.9, 3.10, and 3.11 the graphs of the hyperbolic
functions are shown; the dotted lines in Fig. 3.9 are the graphs of
y = (1/2)e^x and y = (1/2)e^{-x}, from which the graphs of sinh x and cosh x
may easily be constructed.

Cosh x obviously is an even function, that is, a function which remains
unchanged when x is replaced by -x, whereas sinh x is an odd function,
that is, a function that changes sign when x is replaced by -x (cf.
p. 29). 
Figure 3.10


Figure 3.11 
By its definition, the function

    cosh x = \frac{e^x + e^{-x}}{2}


is positive and not less than one for all values of x. It has its least 
value when x = 0: cosh 0 = 1. 


The fundamental relation between cosh x and sinh x,

    cosh^2 x - sinh^2 x = 1,

follows immediately from the definitions. If we now denote the independent
variable by t instead of x and write

    x = cosh t,    y = sinh t,

we have

    x^2 - y^2 = 1;

that is, the point with the coordinates x = cosh t, y = sinh t moves
along the rectangular hyperbola x^2 - y^2 = 1 as t ranges over the whole
scale of values from -∞ to +∞. According to the defining equation,
x ≥ 1, and our formulas make it evident that y runs through the whole
scale of values -∞ to +∞ as t does; for if t tends to infinity so does
e^t, whereas e^{-t} tends to zero. We may therefore state more exactly:
As t runs from -∞ to +∞, the equations x = cosh t, y = sinh t give
us one branch, namely, the right-hand one, of the rectangular hyperbola.


b. Addition Theorems and Formulas for Differentiation 


From their definition we obtain the addition theorems for the hyper- 
bolic functions: 


(10)    cosh (a + b) = cosh a cosh b + sinh a sinh b,
        sinh (a + b) = sinh a cosh b + cosh a sinh b.


The proofs are obtained at once if we write

    cosh (a + b) = \frac{e^a e^b + e^{-a} e^{-b}}{2},    sinh (a + b) = \frac{e^a e^b - e^{-a} e^{-b}}{2},

and insert in these equations

    e^a = cosh a + sinh a,    e^{-a} = cosh a - sinh a,
    e^b = cosh b + sinh b,    e^{-b} = cosh b - sinh b.


Between these formulas and the corresponding trigonometrical formulas 
there is a striking analogy. The only difference in the addition theorems 
is one sign in the first formula. 
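
The fundamental relation and the two addition theorems are easily confirmed numerically from the exponential definitions. The sketch below defines sinh and cosh directly by these definitions (rather than calling library routines) and prints the three differences, which should vanish up to rounding; the sample arguments are arbitrary.

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2.0

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2.0

a, b = 0.8, -1.3          # arbitrary sample arguments
identity = cosh(a) ** 2 - sinh(a) ** 2 - 1.0
add_cosh = cosh(a + b) - (cosh(a) * cosh(b) + sinh(a) * sinh(b))
add_sinh = sinh(a + b) - (sinh(a) * cosh(b) + cosh(a) * sinh(b))
print(identity, add_cosh, add_sinh)
```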
A corresponding analogy holds for the differentiation formulas.
Remembering that d(e^x)/dx = e^x, we readily find that

(11)    \frac{d}{dx} cosh x = sinh x,                \frac{d}{dx} sinh x = cosh x,
        \frac{d}{dx} tanh x = \frac{1}{cosh^2 x},    \frac{d}{dx} coth x = -\frac{1}{sinh^2 x}.

From the first two equations it follows immediately that y = cosh x
and y = sinh x are solutions of the differential equation

(12)    \frac{d^2 y}{dx^2} = y,

which again only differs in sign from the analogous equation satisfied
by the trigonometric functions cos x and sin x (see p. 171).
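
The differentiation formulas can be tested with difference quotients. The following sketch is an illustration only; it uses the hyperbolic functions of the Python `math` module, symmetric difference quotients with step sizes chosen by us, and an arbitrary sample point.

```python
import math

def central(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second(f, x, h=1e-4):
    """Symmetric second difference approximating f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def coth(x):
    return 1.0 / math.tanh(x)

x = 0.6                                                   # arbitrary sample point
err_cosh = central(math.cosh, x) - math.sinh(x)
err_tanh = central(math.tanh, x) - 1.0 / math.cosh(x) ** 2
err_coth = central(coth, x) + 1.0 / math.sinh(x) ** 2
err_ode = second(math.sinh, x) - math.sinh(x)             # y'' = y for y = sinh x
print(err_cosh, err_tanh, err_coth, err_ode)
```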


c. The Inverse Hyperbolic Functions 


To the hyperbolic functions x = cosh t, y = sinh t, there correspond
inverse functions, which we denote¹ by


    t = ar cosh x,    t = ar sinh y.


Since the function sinh t is monotonic increasing² throughout the
interval -∞ < t < +∞, its inverse function is uniquely determined
for all values of y; on the other hand, a glance at the graph (see
Fig. 3.9, p. 229) shows that t = ar cosh x is not uniquely determined,
but has an ambiguity of sign, because to a given value of x there
corresponds not only the number t but also the number -t. Since
cosh t ≥ 1 for all values of t, its inverse ar cosh x is defined only for
x ≥ 1.
We can easily express these inverse functions in terms of the logarithm 
by regarding the quantity e^t = u in the definitions

    x = \frac{e^t + e^{-t}}{2},    y = \frac{e^t - e^{-t}}{2}

as unknown and solving these (quadratic) equations for u:

    u = x ± \sqrt{x^2 - 1},    u = y + \sqrt{y^2 + 1};

since u = e^t can have only positive values, the square root in the
second equation must be taken with the positive sign, whereas in the
first either sign is possible (which corresponds to the ambiguity
mentioned above). In the logarithmic form, t = log u, and hence

(13)    t = log (x ± \sqrt{x^2 - 1}) = ar cosh x,
        t = log (y + \sqrt{y^2 + 1}) = ar sinh y.

1 The symbolic notation cosh^{-1} x, etc., is also used; cf. footnote, p. 54.
2 (d/dt) sinh t = cosh t > 0.


In the case of ar cosh x the variable x is restricted to the interval
x ≥ 1, whereas ar sinh y is defined for all values of y.
Equation (13) gives us two values,

    log (x + \sqrt{x^2 - 1})    and    log (x - \sqrt{x^2 - 1}),

for ar cosh x, corresponding to the two branches of ar cosh x. Since

    (x + \sqrt{x^2 - 1})(x - \sqrt{x^2 - 1}) = 1,

the sum of these two values of ar cosh x is zero, which agrees with the
ambiguity in the sign of t mentioned before.
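
The logarithmic expressions are easy to try out. In the sketch below (function names are our own) both branches of ar cosh x are computed from the logarithm, their sum is formed, and the principal values are compared with the library functions `math.acosh` and `math.asinh`.

```python
import math

def arsinh(y):
    """ar sinh y = log(y + sqrt(y^2 + 1)), defined for all y."""
    return math.log(y + math.sqrt(y * y + 1.0))

def arcosh_branches(x):
    """Both values log(x + sqrt(x^2 - 1)) and log(x - sqrt(x^2 - 1)), for x >= 1."""
    r = math.sqrt(x * x - 1.0)
    return math.log(x + r), math.log(x - r)

t = 1.7                                    # arbitrary sample value
upper, lower = arcosh_branches(math.cosh(t))
print(upper, lower, upper + lower)         # the two branches differ only in sign
print(arsinh(math.sinh(-t)), math.asinh(math.sinh(-t)))
```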

The inverses of the hyperbolic tangent and hyperbolic cotangent can 
be defined analogously, and can also be expressed in terms of loga- 
rithms. These functions we denote by ar tanh x and ar coth x; ex- 
pressing the independent variable everywhere by x, we readily obtain 


(14)    ar tanh x = \frac{1}{2} log \frac{1 + x}{1 - x}    in the interval -1 < x < 1,
        ar coth x = \frac{1}{2} log \frac{x + 1}{x - 1}    in the intervals x < -1, x > 1.


The differentiation of these inverse functions may be carried out by 
the reader himself; he may make use of either the rule for differen- 
tiating an inverse function or the chain rule in conjunction with these 
expressions for the inverse functions in terms of logarithms. If x is the 
independent variable, the results are 


    \frac{d}{dx} ar cosh x = ± \frac{1}{\sqrt{x^2 - 1}},    \frac{d}{dx} ar sinh x = \frac{1}{\sqrt{x^2 + 1}},

    \frac{d}{dx} ar tanh x = \frac{1}{1 - x^2},    \frac{d}{dx} ar coth x = \frac{1}{1 - x^2}.


The last two formulas do not contradict each other, since the first holds
only for -1 < x < 1 and the second only for x < -1 and 1 < x.
The two values of the derivative d(ar cosh x)/dx, expressed by the sign
± in the first formula, correspond to the two different branches of the
curve y = ar cosh x = log (x ± \sqrt{x^2 - 1}).
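
That ar tanh x and ar coth x share the derivative formula 1/(1 - x^2) on their separate domains can be seen numerically. The sketch below, an illustration with sample points and step size of our own choosing, builds both functions from the logarithm and compares difference quotients with the formula.

```python
import math

def artanh(x):
    """(1/2) log((1 + x)/(1 - x)), valid for -1 < x < 1."""
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

def arcoth(x):
    """(1/2) log((x + 1)/(x - 1)), valid for x < -1 or x > 1."""
    return 0.5 * math.log((x + 1.0) / (x - 1.0))

def central(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

print(math.tanh(artanh(0.3)), 1.0 / math.tanh(arcoth(2.0)))   # should reproduce 0.3 and 2.0
print(central(artanh, 0.3), 1.0 / (1.0 - 0.3 ** 2))           # point with |x| < 1
print(central(arcoth, 2.0), 1.0 / (1.0 - 2.0 ** 2))           # point with |x| > 1
```

The derivative is positive in the first case and negative in the second, although the formula is the same.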
d. Further Analogies


The similarities between the hyperbolic and the trigonometric 
functions are no accident. The deeper source of these analogies 
becomes apparent when we consider these functions for imaginary 
arguments, as we shall do later in Section 7.7a. We shall then be able 
to identify cosh x with cos (ix) and sinh x with (1/i) sin (ix), where 
i = \sqrt{-1}. This fact makes it evident that every relation involving
trigonometric functions has its counterpart for hyperbolic functions. 
Many of those analogies have interesting geometrical or physical 
interpretations. (See also Chapter 4, p. 363.) 

In the above representation of the rectangular hyperbola by the
quantity t, we did not ascribe any geometrical meaning to the “parameter”
t itself. We shall now return to this subject, and encounter a
further analogy between the trigonometric and the hyperbolic functions.
If we represent the circle with equation x^2 + y^2 = 1 by means of a
parameter t in the form x = cos t, y = sin t, we can interpret the
quantity t as an angle or as a length of arc measured along the circumference;
we may, however, also regard t as twice the area of the
circular sector corresponding to that angle, the area being reckoned
positive or negative depending on whether the angle is positive or
negative.

We now state analogously that for the hyperbolic functions the
quantity t is twice the area of the hyperbolic sector for x^2 - y^2 = 1
shown shaded in Fig. 3.12.¹ It is this interpretation of t in terms of
areas that accounts for the names t = ar cosh x and t = ar sinh y
given to the inverse hyperbolic functions.² The proof is obtained
without difficulty if we refer the hyperbola to its asymptotes as axes by
means of the transformation of coordinates

    x - y = \sqrt{2} ξ,    x + y = \sqrt{2} η,

or

    x = \frac{1}{\sqrt{2}} (ξ + η),    y = \frac{1}{\sqrt{2}} (η - ξ);

with these new coordinates the equation of the hyperbola is ξη = 1/2.
Hence the two right triangles OPQ and OAB both have area 1/4, for the
lengths of OQ and QP are, respectively, η and 1/(2η), and the area in
question is equal to that of the figure ABQP. Obviously the coordinates
of the points A and P are

    ξ = \frac{1}{\sqrt{2}}, η = \frac{1}{\sqrt{2}}    and    ξ = \frac{x - y}{\sqrt{2}}, η = \frac{x + y}{\sqrt{2}},

respectively, and for double the area of our figure we thus obtain

    2 \int_{1/\sqrt{2}}^{(x + y)/\sqrt{2}} \frac{1}{2η} dη = log (x + y) = log (x ± \sqrt{x^2 - 1}),

but by Eq. (13), p. 233, this is equal to t, proving our assertion.

1 For a different proof, see p. 372.

2 Just as the notation t = arc cos x refers to an arc of the unit circle, so
t = ar cosh x refers to an area connected with a rectangular hyperbola x^2 - y^2 = 1.
Incidentally, t is not the length of the hyperbolic arc.

Figure 3.13 To illustrate the hyperbolic

In conclusion, it may be pointed out that, as shown in Fig. 3.13, the 
hyperbolic functions can be graphically represented on the hyperbola, 
just as the trigonometric functions can be represented on the circle. 
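
The area interpretation of t can also be tested directly in the original coordinates, without the transformation to the asymptotes. The sketch below is an independent numerical check under our own setup: for a point P = (cosh t, sinh t) with t > 0 it takes the triangle with vertices O, (x, 0), P and subtracts the area under the hyperbola y = \sqrt{s^2 - 1} between s = 1 and s = x, which is approximated by the midpoint rule. The function name and the number of subdivisions are assumptions.

```python
import math

def sector_area(t, n=200_000):
    """Area bounded by the x-axis, the ray OP and the hyperbola x^2 - y^2 = 1,
    where P = (cosh t, sinh t) and t > 0."""
    x, y = math.cosh(t), math.sinh(t)
    h = (x - 1.0) / n
    # midpoint rule for the integral of sqrt(s^2 - 1) from s = 1 to s = x
    under = h * sum(math.sqrt((1.0 + (i + 0.5) * h) ** 2 - 1.0) for i in range(n))
    return 0.5 * x * y - under

t = 1.2                                 # arbitrary sample parameter
double_area = 2 * sector_area(t)
print(double_area, t)
```

Twice the computed area agrees with t to within the accuracy of the quadrature.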


3.6 Maxima and Minima 


As the first of a great variety of applications we consider the theory of 
maxima and minima of a function, in conjunction with a geometrical 
discussion of the second derivative. 


a. Convexity and Concavity of Curves 


By definition the derivative f'(x) = df (x)/dx represents the slope of 
the curve y = f(x). The derivative of the function f'(x) or of the 
slope of the curve y = f(x) is given by the derivative df'(x)/dx =
d^2 f(x)/dx^2 = f''(x), the second derivative of f(x), and so on. If the
second derivative f''(x) is positive at a point x, so that owing to
continuity (which we assume) it is positive in some neighborhood¹ of this
point x, then throughout this neighborhood the derivative f'(x)
increases with increasing values of x. Hence the curve y = f(x) turns its
convex side downwards or is “open” upwards. We call the function 
f(x) or the curve y = f(x) convex. If f"(x) is negative, the curve and 
the function are concave. Therefore when f"(x) > 0, the curve in the 
neighborhood of the point lies above the tangent while when f"(x) < 0, 


1 We make use here of the intuitively obvious observation: a continuous function
g(x) which is positive at a point x_0 also is positive for all points of a sufficiently
small neighborhood of x_0 (as far as they belong to the domain of g). The formal
proof is simple. From the continuity of g at x_0 we know that for every positive ε
the inequality |g(x) - g(x_0)| < ε holds for all x in a sufficiently small neighborhood
|x - x_0| < δ of the point x_0. Since g(x_0) > 0, we are free to choose for ε the value
(1/2)g(x_0), so that |g(x) - g(x_0)| < (1/2)g(x_0) in some neighborhood. Since then
g(x_0) - g(x) ≤ |g(x) - g(x_0)| < (1/2)g(x_0), it follows that g(x) > (1/2)g(x_0) > 0.
Figure 3.14 (a) f''(x) > 0. (b) f''(x) < 0.


it lies below the tangent (see Figs. 3.14a and 3.14b) (cf. Problem 4, p. 200 
and Section 5.6). 


Point of Inflection 


Special consideration is required only in points where f"(x) = 0. 
On passing through such a point the second derivative f"(x) will gen- 
erally change its sign. Such a point will then be a point of transition 
between the two cases just indicated; that is, on one side the tangent is 
above the curve and on the other side below it, whereas at this point it 
crosses the curve (see Fig. 3.15). Such a point is called a point of in- 
flection of the curve, and the corresponding tangent is called an in- 
flectional tangent. 

Figure 3.15 Point of inflection.

The simplest example is given by the function y = x^3, the cubical
parabola, for which the x-axis itself is an inflectional tangent at the
inflection point x = 0 (see Fig. 3.3, p. 209). Another example is given
by the function f(x) = sin x, for which

    f'(x) = d(sin x)/dx = cos x    and    f''(x) = d^2(sin x)/dx^2 = -sin x.

Consequently, f'(0) = 1 and f''(0) = 0; since the sign of f''(x) changes
at x = 0, the sine curve has at the origin an inflectional tangent inclined
at an angle of 45 degrees to the x-axis.

It must, however, be noted that points can exist where f''(x) = 0 and
the sign of f''(x) does not change with increasing x, while the tangent
does not cut the curve but remains entirely on one side of it. For
example, the curve y = x^4 lies entirely above the x-axis, although the
second derivative f''(x) = 12x^2 vanishes for x = 0.
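
The distinction between the three examples can be made visible with a crude numerical test. The sketch below approximates f'' by a second difference and compares its signs a little to the left and to the right of the point in question; the offset d and the step h are arbitrary choices of ours, and the test is a local heuristic, not a proof.

```python
import math

def second(f, x, h=1e-3):
    """Symmetric second difference approximating f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def changes_sign(f, x0, d=0.1):
    """True if the approximate f'' has opposite signs at x0 - d and x0 + d."""
    return second(f, x0 - d) * second(f, x0 + d) < 0

cubic = changes_sign(lambda x: x ** 3, 0.0)      # cubical parabola
sine = changes_sign(math.sin, 0.0)               # sine curve
quartic = changes_sign(lambda x: x ** 4, 0.0)    # f''(0) = 0 without change of sign
print(cubic, sine, quartic)
```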


b. Maxima and Minima—Relative Extrema. Stationary Points 


A function f(x) has a maximum at a point ξ if the value of f at the
point ξ is not exceeded by the value of f at any other point x of the
domain of f; that is, f(ξ) ≥ f(x) for all x where f is defined.¹ Similarly,
f has a minimum at ξ if f(ξ) ≤ f(x) for all x in the domain. The word
extrema is used to cover both maxima and minima.

1 We talk of a strict maximum point ξ if f(ξ) > f(x) for all x in the domain of f that
are different from ξ.

The function f(x) = \sqrt{1 - x^2}, for example, which is defined for
-1 ≤ x ≤ 1, has minima at x = ±1 and a maximum at x = 0. It is
easy to give examples of continuous functions which have no maxima
or no minima. Thus the function f(x) = 1/(1 + x^2) (Fig. 3.8, p. 216) in
the domain -∞ < x < +∞ has no minimum; the function f(x) =
1/x defined for 0 < x < ∞ has no extremum points at all. We recall,
however, from Chapter 1, p. 101 the theorem of Weierstrass, according
to which a continuous function defined in a closed finite interval always
has a maximum (and similarly a minimum) there.

Our object is to find a means of locating the extrema of a function or
curve. This problem, which is encountered very frequently in geometry,
mechanics, physics, and other fields, was one of the principal incentives
for the development of the calculus in the seventeenth century.

Calculus does not furnish a direct method for picking out the
extrema of a function f(x), but it permits us to locate the so-called
relative extremum points, among which the actual maxima and minima
have to occur. The point ξ is a relative maximum (minimum) of f if f
has its greatest (least) value at ξ when compared not with all possible
values of f(x) but just with the values of f(x) for x in some neighborhood
of ξ. By a neighborhood of the point ξ we mean here any open interval
α < x < β which contains the point ξ but may be arbitrarily small.
A relative extremum point ξ of f is then a point which is an extremum
point when f is restricted to all those points of its domain lying
sufficiently close to ξ.¹ Obviously, the extrema of the function are
included among the relative extrema. To avoid confusion we shall use
the terms absolute maxima (minima) for the maxima and minima of
f in its entire domain (see Fig. 3.16).

1 The formal definition of a relative maximum point ξ would state that there exists
an open interval containing ξ such that f(ξ) ≥ f(x) for all x of that interval for
which f is defined.

Figure 3.16 Graph of function defined on the interval [a, b] with relative minima at
x = a, x_2, x_4, x_6, relative maxima at x_1, x_3, x_5, b, absolute maximum at b, and absolute
minimum at x_4.

Geometrically speaking, relative maxima and minima, if not located
in the end points of the interval of definition, are respectively the wave
crests and troughs of the curve. A glance at Fig. 3.16 shows that the
value of a relative maximum at one point may very well be less than
the value of a relative minimum at another point. The diagram also
suggests the fact that relative maxima and minima of a continuous
function alternate: Between two successive relative maxima there is
always located a relative minimum.

Let f(x) be a differentiable function defined in the closed interval
a ≤ x ≤ b. We see at once that at a relative extremum point which is
located in the interior of the interval the tangent to the curve must be
horizontal. (The formal proof is given below.) Hence the condition

    f'(ξ) = 0

is necessary for a relative extremum at the point ξ with a < ξ < b.
If, however, f(ξ) is a relative extremum and ξ coincides with one of the
end points of the interval of definition, the equation f'(ξ) = 0 need not
hold. We can only say that if the left-hand end point is a relative
maximum (minimum) point, the slope f'(a) of the curve cannot be
positive (negative), while if the right-hand end point b is a relative
maximum (minimum) then f'(b) cannot be negative (positive).

The points at which the tangent to the curve y = f(x) is horizontal,
corresponding to the roots ξ of the equation f'(ξ) = 0, are called the
critical points or stationary points of f. All relative extrema of a
differentiable function f which are interior points of the domain of f
are stationary points. Hence: an absolute maximum or minimum of the
function coincides either with a critical point of the function or with
an end point of its domain. In order to locate the absolute maxima
(minima) of the function we have only to compare the values of f in the
are greatest (least). If f fails at a finite number of points to have a 
derivative, we have only to add those points to the list of possible 
locations of an extremum and to check also the values of f at those 
points. Thus the main labor in determining the extrema of a function 
is reduced to that of finding the zeros of the derivative of the function, 
which usually are finite in number. 

To take a simple example, let us determine the largest and smallest 
values of the function f(x) = (x^6 - 3x^2)/10 in the interval -2 ≤ x ≤ 2.
Here the critical points, the roots of the equation f'(x) = 6(x^5 - x)/10 = 0,
are located at x = 0, +1, -1. Computing the values of f at those
points and also at the end points of the interval, we find

    x       -2      -1      0       1       2
    f(x)    5.2     -0.2    0       -0.2    5.2

It is clear that the points x = ±1 represent relative minima, whereas
relative maxima occur at x = 0 and x = ±2. The maximum value of
the function, assumed in the end points of the interval, is 5.2; the
minimum value, assumed in the points x = ±1, is -0.2 (see Fig. 3.17).
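
The procedure of comparing the values at the critical points and at the end points is mechanical enough to be written out as a short program. The sketch below does this for the function f(x) = (x^6 - 3x^2)/10 of Fig. 3.17 on the interval from -2 to 2; the list of candidate points is entered by hand from the roots of f'(x) = 6(x^5 - x)/10, and the variable names are our own.

```python
def f(x):
    return (x ** 6 - 3 * x ** 2) / 10

def fprime(x):
    return 6 * (x ** 5 - x) / 10

candidates = [-2, -1, 0, 1, 2]            # end points and the roots of f'(x) = 0
assert all(fprime(x) == 0 for x in (-1, 0, 1))

values = {x: f(x) for x in candidates}
largest = max(values.values())
smallest = min(values.values())
print(values)
print("maximum", largest, "at", [x for x, v in values.items() if v == largest])
print("minimum", smallest, "at", [x for x, v in values.items() if v == smallest])
```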

Without appealing to intuition we can easily prove by purely analytic
methods that f'(ξ) = 0 whenever ξ is a relative extremum point in the
interior of the domain of f provided f is differentiable at ξ. (Compare
the exactly analogous considerations for Rolle’s theorem, p. 175.) If
the function f(x) has a relative maximum at the point ξ, then for all
sufficiently small values of h different from zero the expression
f(ξ + h) - f(ξ) must be negative or zero. Therefore

    \frac{f(ξ + h) - f(ξ)}{h} ≤ 0

for h > 0, whereas

    \frac{f(ξ + h) - f(ξ)}{h} ≥ 0

for h < 0. Thus if h tends to zero through positive values, the limit
cannot be positive, whereas if h tends to zero through negative values,
the limit cannot be negative. However, since we have assumed that the
derivative at ξ exists, these two limits must be equal to one another,
and, in fact, to the value f'(ξ), which therefore can only be zero; we
must have f'(ξ) = 0. A similar proof holds for a relative minimum. The
proof also shows that if the left-hand end point ξ = a is a relative
maximum (minimum) point, then at least f'(a) ≤ 0 [f'(a) ≥ 0]; if the
right-hand end point b is a relative maximum (minimum) point, then
f'(b) ≥ 0 [f'(b) ≤ 0].

Figure 3.17 y = (x^6 - 3x^2)/10.

The condition f'(ξ) = 0 characterizing the critical points is by no
means sufficient for the occurrence of a relative extremum. There may
be points at which the derivative vanishes, that is, at which the tangent
is horizontal, although the curve has neither a relative maximum nor
minimum there. This occurs if at the given point the curve has a
horizontal inflectional tangent cutting it, as in the example of the
function y = x^3 at the point x = 0.

The following test gives the conditions under which a critical point
is a point of relative maximum or minimum. It applies to a continuous
function f having a continuous derivative f' which vanishes at most
at a finite number of points or, more generally, to differentiable functions
f for which f' changes sign at most at a finite number of points:


The function f(x) has a relative extremum at an interior point ξ of its
domain if, and only if, the derivative f'(x) changes sign as x passes through
this point; in particular, the function has a relative minimum if near ξ
the derivative is negative to the left of ξ and positive to the right, whereas
in the contrary case it has a maximum.


We prove this rigorously by using the mean value theorem. First,
we observe that to the left and right of ξ there exist intervals ξ_1 < x < ξ
and ξ < x < ξ_2, in each of which f'(x) has only one sign, since f'
vanishes only at a finite number of points. (Here ξ_1 and ξ_2 can be taken
as the points nearest to ξ at which f' vanishes, if such points exist.) If
the signs of f'(x) in these two intervals are different, then f(ξ + h) -
f(ξ) = hf'(ξ + θh) has the same sign for all numerically small values
of h, whether h is positive or negative, so that ξ is a relative extremum.
If f'(x) has the same sign in both intervals, then hf'(ξ + θh) changes sign
when h does, so that f(ξ + h) is greater than f(ξ) on one side and less
than f(ξ) on the other side, and there is no extreme value. Our theorem
is thus proved.

At the same time we see that the value f(ξ) is the greatest or least
value of the function in every interval containing the point ξ in which f
is differentiable and in which the only change of sign of f'(x) occurs at ξ
itself.

The mean value theorem on which this proof is based can still be
used if f(x) is not differentiable at an end point of the interval in which
it is applied, provided that f(x) is differentiable at all the other points of
the interval; hence this proof still holds if f'(x) does not exist at
x = ξ. For example, the function y = |x| has a minimum at x = 0,
since y' > 0 for x > 0 and y' < 0 for x < 0 (cf. Fig. 2.24, p. 167).
The function y = \sqrt[3]{x^2} likewise has a minimum at the point x = 0,
even though its derivative (2/3)x^{-1/3} is infinite there (cf. Fig. 2.27, p. 169).
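
The sign test translates directly into a small routine. The sketch below is a heuristic illustration with names of our own: it inspects the sign of f' at one point on each side of ξ, on the assumption that the offset d is small enough for f' to keep its sign between ξ and those points. The derivative is never evaluated at ξ itself, so the test also covers the two examples in which f'(ξ) does not exist.

```python
import math

def classify(fprime, xi, d=1e-3):
    """Classify xi by the sign of f' just to the left and just to the right of xi."""
    left, right = fprime(xi - d), fprime(xi + d)
    if left < 0 < right:
        return "relative minimum"
    if left > 0 > right:
        return "relative maximum"
    return "no extremum"

def d_abs(x):                 # derivative of y = |x| for x != 0
    return 1.0 if x > 0 else -1.0

def d_cuberoot_sq(x):         # derivative (2/3) x^(-1/3) of y = (x^2)^(1/3) for x != 0
    return (2.0 / 3.0) * math.copysign(abs(x) ** (-1.0 / 3.0), x)

results = [
    classify(d_abs, 0.0),
    classify(d_cuberoot_sq, 0.0),
    classify(lambda x: 3 * x ** 2, 0.0),                  # y = x^3
    classify(lambda x: 6 * (x ** 5 - x) / 10, 0.0),       # y = (x^6 - 3x^2)/10
]
print(results)
```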
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The simplest method for deciding whether a critical point ë is a 
relative maximum or minimum involves the second derivative at that 
point. It is intuitively clear that if f'(£) = 0, then f has a relative 
maximum at & if f’(€) < 0, and a relative minimum if f’(é) > 0. For 
in the first case the curve in the neighborhood of this point lies com- 
pletely below the tangent, and in the second case completely above the 
tangent. This result follows analytically from the preceding test, 
provided that f(x) and f'(x) are continuous and that f”(¢) exists. For 
if f (£) = 0 and, say, f"(é) > 0, we have 


h>0 h-0 h 


It follows that f’(€ + A)/h > 0 for all h 40 which are sufficiently 
small in absolute value; hence f(E + h) and A have the same sign in a 
neighborhood of &. For x near & the derivative f'(x) must be negative 
for x to the left of , and positive for x to the right of &; this 
implies that there is a relative minimum at é. 
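The second-derivative test can be tried out numerically. The following is only a sketch under stated assumptions: the helper name `classify_critical_point`, the step `h`, and the tolerances are illustrative choices, and the derivatives are estimated by central differences rather than computed exactly.

```python
import math

def classify_critical_point(f, xi, h=1e-4):
    """Classify xi using central-difference estimates of f'(xi) and f''(xi).

    Hypothetical helper: step size and tolerances are illustrative only.
    """
    d1 = (f(xi + h) - f(xi - h)) / (2 * h)            # estimate of f'(xi)
    d2 = (f(xi + h) - 2 * f(xi) + f(xi - h)) / (h * h)  # estimate of f''(xi)
    if abs(d1) > 1e-6:
        return "not critical"
    if d2 > 1e-6:
        return "relative minimum"
    if d2 < -1e-6:
        return "relative maximum"
    return "undecided"  # f''(xi) is (numerically) zero: the test gives no answer

print(classify_critical_point(math.cos, 0.0))
print(classify_critical_point(lambda x: x**2, 0.0))
print(classify_critical_point(lambda x: x**4, 0.0))
```

The last call illustrates that the test is silent when $f''(\xi) = 0$: the function $x^4$ has a minimum at $0$, but this must be decided by the sign change of $f'$, not by $f''$.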

The situation is particularly simple in case f”(x) is of one and the 
same sign throughout the interval [a, b] in which f is defined: 


A point $\xi$ at which $f'$ vanishes is a maximum point of $f$ if $f''(x) < 0$
throughout the interval (that is, if the curve is concave), and a minimum
point of $f$ if throughout the interval $f''(x) > 0$ (that is, if the curve is
convex).


Indeed, if $f''(x) < 0$ the function $f'(x)$ is monotonic decreasing,
hence has $\xi$ as its only zero. Moreover, $f' > 0$ for $a < x < \xi$, whereas
$f' < 0$ for $\xi < x < b$. By the mean value theorem this implies again
that $f(x) < f(\xi)$ for $x \ne \xi$, so that $\xi$ turns out to be a strict maximum
point. The minimum of $f$ must coincide with one of the end points
since there is no other critical point besides $\xi$. The same argument
applies when $f'' > 0$ in the interval.


Examples 


Example 1. Of all triangles with given base and given area, to find 
that with the least perimeter. 

To solve this problem, we take the x-axis along the given base AB 
and the middle point of AB as the origin (Fig. 3.18). If C is the vertex
of the triangle, h its altitude (which is fixed by the area and the base), 
and (x, h) are the coordinates of the vertex, then the sum of the two 
sides AC and BC of the triangle is given by 


$$f(x) = \sqrt{(x + a)^2 + h^2} + \sqrt{(x - a)^2 + h^2},$$

where $2a$ is the length of the base. From this we obtain

$$f'(x) = \frac{x + a}{\sqrt{(x + a)^2 + h^2}} + \frac{x - a}{\sqrt{(x - a)^2 + h^2}},$$

$$f''(x) = \frac{-(x + a)^2}{\sqrt{[(x + a)^2 + h^2]^3}} + \frac{1}{\sqrt{(x + a)^2 + h^2}} + \frac{-(x - a)^2}{\sqrt{[(x - a)^2 + h^2]^3}} + \frac{1}{\sqrt{(x - a)^2 + h^2}}$$

$$= \frac{h^2}{\sqrt{[(x + a)^2 + h^2]^3}} + \frac{h^2}{\sqrt{[(x - a)^2 + h^2]^3}}.$$

Figure 3.18

We see at once (1) that f’(0) vanishes, and (2) that f"(x) is always 
positive; hence at x = 0 there is a least value (see p. 243). This least 
value is accordingly given by the isosceles triangle. 
Similarly, we find that of all the triangles with a given perimeter and a 
given base the isosceles triangle has the greatest area. 
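
As a hedged numerical check of Example 1, the sketch below evaluates the sum of the two sides on a grid of vertex positions and confirms that the smallest value occurs at $x = 0$. The values of `a` and `h` and the grid are illustrative assumptions, not taken from the text.

```python
import math

def side_sum(x, a, h):
    """AC + BC for the vertex C = (x, h) and base end points (-a, 0), (a, 0)."""
    return math.hypot(x + a, h) + math.hypot(x - a, h)

def second_derivative(x, a, h):
    """Closed form of f''(x) derived in the example; positive for h != 0."""
    return (h**2 / ((x + a)**2 + h**2) ** 1.5
            + h**2 / ((x - a)**2 + h**2) ** 1.5)

a, h = 2.0, 3.0                      # illustrative half-base and altitude
xs = [i / 100 for i in range(-500, 501)]
best_x = min(xs, key=lambda x: side_sum(x, a, h))
print(best_x, side_sum(best_x, a, h))
```

The grid search only samples finitely many points; the convexity shown by `second_derivative` is what guarantees that the minimum found is the least value on the whole axis.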


Example 2. To find a point on a given straight line such that the 
sum of its distances from two given fixed points is a minimum. 

Let there be given a straight line and two fixed points A and B on the
same side of the line. We wish to find a point P on the straight line such
that the distance PA + PB has the least possible value.¹

¹ If A and B lie on opposite sides of the line, P obviously is just the
intersection of the line with the segment AB.

Figure 3.19 Law of reflection.

We take the given line as the $x$-axis and use the notation of Fig. 3.19.
Then the distance in question is given by

$$f(x) = \sqrt{x^2 + h^2} + \sqrt{(x - a)^2 + h_1^2},$$

and we obtain

$$f'(x) = \frac{x}{\sqrt{x^2 + h^2}} + \frac{x - a}{\sqrt{(x - a)^2 + h_1^2}},$$

$$f''(x) = \frac{h^2}{\sqrt{(x^2 + h^2)^3}} + \frac{h_1^2}{\sqrt{[(x - a)^2 + h_1^2]^3}}.$$

The equation $f'(\xi) = 0$ means

$$\frac{\xi}{\sqrt{\xi^2 + h^2}} = \frac{a - \xi}{\sqrt{(\xi - a)^2 + h_1^2}}$$

or

$$\cos \alpha = \cos \beta;$$


hence the two lines PA and PB must form equal angles with the given 
line. The positive sign of f”(x) shows us that we really have a least 
value. 
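
The equal-angle condition can be verified numerically. This is a minimal sketch, assuming A = (0, h) and B = (a, h1) as in the formulas above; the numerical values and the bisection helper `bisect` are illustrative and not part of the text.

```python
import math

def fprime(x, a, h, h1):
    """Derivative of f(x) = sqrt(x^2 + h^2) + sqrt((x - a)^2 + h1^2)."""
    return x / math.hypot(x, h) + (x - a) / math.hypot(x - a, h1)

def bisect(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, h, h1 = 4.0, 1.0, 3.0             # illustrative positions of A and B
xi = bisect(lambda x: fprime(x, a, h, h1), 0.0, a)
cos_alpha = xi / math.hypot(xi, h)
cos_beta = (a - xi) / math.hypot(a - xi, h1)
print(xi, cos_alpha, cos_beta)
```

Since $f'' > 0$, the derivative $f'$ is increasing, so the root located by bisection is the only critical point.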

The solution of this problem is closely connected with the optical
law of reflection. By an important principle of optics, known as
Fermat's principle of least time, the path of a light ray is determined by
the property that the time the light takes to go from a point A to a
point B under the given conditions must be the least possible. If
the condition is imposed that a ray of light shall on its way from A to B
pass through some point on a given straight line (say on a mirror), we
see that the shortest time will be taken along the ray for which the
“angle of incidence” is equal to the “angle of reflection.”

Example 3. The Law of Refraction.¹ Let there be given two points
A and B on opposite sides of the $x$-axis. What is the path from A to B
requiring the shortest possible time if the velocity on one side of the
$x$-axis is $c_1$ and on the other side $c_2$?

Clearly, this shortest path must consist of two portions of straight
lines meeting one another at a point P on the $x$-axis. Using the notation
of Fig. 3.20, we obtain the two expressions $\sqrt{h^2 + x^2}$ and
$\sqrt{h_1^2 + (a - x)^2}$ for the lengths PA, PB, respectively, and we find
the time of passage along this path by dividing the lengths of the two
segments by the corresponding velocities and then adding:

$$f(x) = \frac{1}{c_1}\sqrt{h^2 + x^2} + \frac{1}{c_2}\sqrt{h_1^2 + (a - x)^2}.$$

By differentiation, we obtain

$$f'(x) = \frac{1}{c_1}\frac{x}{\sqrt{h^2 + x^2}} - \frac{1}{c_2}\frac{a - x}{\sqrt{h_1^2 + (a - x)^2}},$$

$$f''(x) = \frac{1}{c_1}\frac{h^2}{\sqrt{(h^2 + x^2)^3}} + \frac{1}{c_2}\frac{h_1^2}{\sqrt{[h_1^2 + (a - x)^2]^3}}.$$

Figure 3.20 Law of refraction.


¹ While the preceding examples can be treated also by elementary geometry,
this one is not easily disposed of without calculus.

As we readily see from Fig. 3.20, the equation $f'(x) = 0$, that is, the
equation

$$\frac{1}{c_1}\frac{x}{\sqrt{h^2 + x^2}} = \frac{1}{c_2}\frac{a - x}{\sqrt{h_1^2 + (a - x)^2}},$$

is equivalent to the condition $(1/c_1) \sin \alpha = (1/c_2) \sin \beta$, or

$$\frac{\sin \alpha}{\sin \beta} = \frac{c_1}{c_2}.$$

The reader should verify the fact that there is only one point which
satisfies this condition and that this point actually yields the required
least value.


Figure 3.21 Point on ellipse having the least distance from a point on the major axis. 


The physical meaning of our example is again given by the optical
principle of least time. A ray of light traveling between two points
describes the path of shortest time. If $c_1$ and $c_2$ are the velocities of
light on either side of the boundary of two optical media, the path of the
light will be that given by our result, which is a form of Snell's law of
refraction.
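
The refraction condition can be checked in the same spirit. The sketch below assumes A = (0, h) above the axis and B = (a, -h1) below it, with illustrative velocities `c1` and `c2`; it locates the zero of $f'$ by bisection (a hypothetical helper) and compares $\sin\alpha/\sin\beta$ with $c_1/c_2$.

```python
import math

def travel_time_derivative(x, a, h, h1, c1, c2):
    """f'(x) for f(x) = sqrt(h^2 + x^2)/c1 + sqrt(h1^2 + (a - x)^2)/c2."""
    return (x / (c1 * math.hypot(h, x))
            - (a - x) / (c2 * math.hypot(h1, a - x)))

def bisect(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, h, h1 = 2.0, 1.0, 1.0             # illustrative geometry
c1, c2 = 1.0, 0.75                   # illustrative velocities
x = bisect(lambda t: travel_time_derivative(t, a, h, h1, c1, c2), 0.0, a)
sin_alpha = x / math.hypot(h, x)
sin_beta = (a - x) / math.hypot(h1, a - x)
print(sin_alpha / sin_beta, c1 / c2)
```

Because $f''(x) > 0$ everywhere, $f'$ changes sign exactly once on $0 < x < a$, which is the uniqueness the reader is asked to verify.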


Example 4. Find the point of an ellipse which is closest to a given 
point on its major axis (Fig. 3.21). 
Taking the ellipse in the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad (b < a)$$

and the given point on the major axis as $(c, 0)$, we find for the distance
of any point $(x, y)$ on the ellipse from the point $(c, 0)$, the expression

$$d = \sqrt{(x - c)^2 + b^2\left(1 - \frac{x^2}{a^2}\right)},$$

where $-a \le x \le a$. The function $f(x) = d^2$ is convex ($f'' > 0$). It
has a minimum for the same $x$ as $d$ itself. The only critical point of $f$
is at $x = c/(1 - b^2/a^2)$. If this point lies in the domain of $d$, it
represents the minimum point; if not, the minimum of $d$ corresponds to the
end point of the major axis closest to $c$. We find accordingly for the
minimum distance the values

$$d = b\sqrt{1 - \frac{c^2}{a^2 - b^2}} \quad \text{if } |c| < a\left(1 - \frac{b^2}{a^2}\right),$$

$$d = a - |c| \quad \text{if } |c| \ge a\left(1 - \frac{b^2}{a^2}\right).$$
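
The two-case formula can be compared with a brute-force search over the ellipse. This is a sketch only: the function names, the grid size `n`, and the sample values of `a`, `b`, `c` are assumptions made for illustration.

```python
import math

def min_distance_formula(a, b, c):
    """Minimum distance from (c, 0) to the ellipse, by the two-case formula."""
    if abs(c) < a * (1 - b**2 / a**2):
        return b * math.sqrt(1 - c**2 / (a**2 - b**2))
    return a - abs(c)

def min_distance_grid(a, b, c, n=20_000):
    """Brute-force minimum of d over a grid of x in [-a, a]."""
    best = float("inf")
    for i in range(n + 1):
        x = -a + 2 * a * i / n
        d2 = (x - c)**2 + b**2 * (1 - x**2 / a**2)
        best = min(best, d2)
    return math.sqrt(max(best, 0.0))

a, b = 2.0, 1.0                      # illustrative semi-axes, b < a
for c in (1.0, 1.8):                 # interior critical point, then end point
    print(c, min_distance_formula(a, b, c), min_distance_grid(a, b, c))
```

For these semi-axes the threshold $a(1 - b^2/a^2)$ equals 1.5, so the two sample values of `c` exercise one case each.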


*3.7 The Order of Magnitude of Functions 


Differences in the behavior of functions for large values of the
argument lead to the notion of the order of magnitude. Because of its
great importance, this matter deserves a brief discussion here even
though it is not directly connected with the idea of the integral or of the
derivative.


a. The Concept of Order of Magnitude. The Simplest Cases 


If the variable $x$ increases beyond all bounds, then, for $\alpha > 0$, the
functions $x^\alpha$, $\log x$, $e^x$, $e^{\alpha x}$ also increase beyond all
bounds. They increase, however, in essentially different ways. For example,
the function $x^3$ becomes “infinite to a higher order” than $x^2$; by this we
mean: as $x$ increases, the quotient $x^3/x^2$ itself increases beyond all
bounds. Similarly, the function $x^\alpha$ becomes infinite to a higher order
than $x^\beta$ if $\alpha > \beta > 0$, etc.

Quite generally, we shall say of two functions $f(x)$ and $g(x)$, whose
absolute values increase with $x$ beyond all bounds, that $f(x)$ becomes
infinite of a higher order than $g(x)$ if for $x \to \infty$ the quotient
$|f(x)/g(x)|$ increases beyond all bounds; we shall say that $f(x)$ becomes
infinite of a lower order than $g(x)$ if the quotient $|f(x)/g(x)|$ tends to
zero as $x$ increases; and we shall say that the two functions become
infinite of the same order of magnitude if as $x$ increases, the quotient
$|f(x)/g(x)|$ possesses a limit different from zero or at least remains
between two fixed positive bounds. For example, the function
$ax^3 + bx^2 + c = f(x)$, where $a \ne 0$, will be of the same order of
magnitude as the function $x^3 = g(x)$; for the quotient
$|f(x)/g(x)| = |(ax^3 + bx^2 + c)/x^3|$ has the limit $|a|$ as $x \to \infty$;
on the other hand the function $x^3 + x + 1$ becomes infinite of a higher
order of magnitude than the function $x^2 + x + 1$.

A sum of two functions $f(x)$ and $\phi(x)$, where $f(x)$ is of higher order of
magnitude than $\phi(x)$, has the same order of magnitude as $f(x)$. For
$|(f(x) + \phi(x))/f(x)| = |1 + \phi(x)/f(x)|$, and by hypothesis this
expression tends to one as $x$ increases.


b. The Order of Magnitude of the Exponential 
Function and of the Logarithm 


We might be tempted to measure the order of magnitude of functions
by a scale, assigning to the quantity $x$ the order of magnitude one and to
the power $x^\alpha$ ($\alpha > 0$) the order of magnitude $\alpha$. A polynomial
of the $n$th degree then obviously would have the order of magnitude $n$; a
rational function, the degree of whose numerator is higher by $h$ than
that of the denominator, would have the order of magnitude $h$.

It turns out, however, that any attempt to describe the order of
magnitude of arbitrary functions by the foregoing scale must fail.
For there are functions that become infinite of higher order than the
power $x^\alpha$ of $x$, no matter how large $\alpha$ is chosen; again, there are
functions which become infinite of lower order than the power $x^\alpha$,
no matter how small the positive number $\alpha$ is chosen. These functions
therefore will not fit in our scale.

Without entering into a detailed theory we state the following 
theorem. 


THEOREM. If $a$ is an arbitrary number greater than one, then the
quotient $a^x/x$ tends to infinity as $x$ increases.
PROOF. To prove this we construct the function

$$\phi(x) = \log \frac{a^x}{x} = x \log a - \log x;$$

it is obviously sufficient to show that $\phi(x)$ increases beyond all bounds
if $x$ tends to $+\infty$. For this purpose we consider the derivative

$$\phi'(x) = \log a - \frac{1}{x}$$




and notice that for $x \ge c = 2/\log a$ this is not less than the positive
number $\frac{1}{2} \log a$. Hence it follows that for $x \ge c$

$$\phi(x) - \phi(c) = \int_c^x \phi'(t)\,dt \ge \int_c^x \tfrac{1}{2} \log a\,dt = \tfrac{1}{2}(x - c) \log a,$$

$$\phi(x) \ge \phi(c) + \tfrac{1}{2}(x - c) \log a,$$

and the right-hand side becomes infinite for $x \to \infty$.

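We give a second proof of this important theorem: with $\sqrt{a} =
b = 1 + h$, we have $b > 1$ and $h > 0$. Let $n$ be the integer such that
$n \le x < n + 1$; we may take $x \ge 1$, so that $n \ge 1$. Applying the
lemma of p. 64, we have

$$\sqrt{\frac{a^x}{x}} = \frac{b^x}{\sqrt{x}} > \frac{(1 + h)^n}{\sqrt{n + 1}} \ge \frac{1 + nh}{\sqrt{n + 1}} > \frac{nh}{\sqrt{2n}} = \frac{h}{\sqrt{2}}\sqrt{n},$$

so that

$$\frac{a^x}{x} > \frac{h^2}{2}\,n,$$

and therefore tends to infinity with $x$.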


From the fact just proved many others follow. For example: for
every positive index $\alpha$ and every number $a > 1$ the quotient
$a^x/x^\alpha$ tends to infinity as $x$ increases; that is,


THEOREM. The exponential function becomes infinite of a higher order
of magnitude than any power of x.


For the proof we need show only that the $\alpha$th root of the expression,
that is,

$$\frac{a^{x/\alpha}}{x} = \frac{1}{\alpha} \cdot \frac{a^{x/\alpha}}{x/\alpha} = \frac{1}{\alpha} \cdot \frac{a^y}{y},$$

tends to infinity. This, however, follows immediately from the
preceding theorem when $x$ is replaced by $y = x/\alpha$.

In a similar fashion we prove the following theorem. For every
positive value of $\alpha$ the quotient $(\log x)/x^\alpha$ tends to zero for
$x \to \infty$; that is


THEOREM. The logarithm becomes infinite of a lower order of
magnitude than any arbitrarily small positive power of x.

PROOF. The proof follows immediately if we put $\log x = y$ so that
our quotient is transformed into $y/e^{\alpha y}$. We then put $e^\alpha = a$; then
$a > 1$, and our quotient $y/a^y$ approaches zero as $y$ tends to infinity.
Since $y$ approaches infinity as $x$ does, our theorem is proved.

On the basis of these results we can construct functions of an order
of magnitude far higher than that of the exponential function and other
functions of an order of magnitude far lower than that of the logarithm.
For example, the function $e^{(e^x)}$ is of a higher order than the exponential
function, and the function $\log \log x$ is of a lower order than the
logarithm; moreover we can iterate these processes as often as we like,
piling up the symbols $e$ or $\log$ to any extent we please.

All the functions $x$, $\log x$, $\log (\log x)$, $\log [\log (\log x)]$, etc.,
eventually become arbitrarily large for sufficiently large $x$, but with
increasing slowness. Taking, for example, for $x$ the tremendous number
$x = 10^{100}$ we find that $\log x$ is about 230, whereas $\log (\log x)$ is
only about 5.4.
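
The two figures just quoted are easy to reproduce. The following lines are a small illustration using natural logarithms, which is what $\log$ denotes here; the variable names are incidental.

```python
import math

x = 10**100
log_x = math.log(x)          # math.log accepts arbitrarily large Python ints
log_log_x = math.log(log_x)
print(round(log_x, 2), round(log_log_x, 2))   # 230.26 5.44
```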


c. General Remarks 


These considerations show that it is not possible to assign to all
functions definite numbers as orders of magnitude so that of two
functions the one with the higher order of magnitude has a higher
number. If, for example, the function $x$ is of the order of magnitude
one and the function $x^{1+\epsilon}$ of the order of magnitude $1 + \epsilon$, then
the function $x \log x$ must be of an order of magnitude that is greater than
one and less than $1 + \epsilon$ no matter how small $\epsilon$ is chosen. But there
is no such number.

In addition, it is easy to see that functions need not possess a clearly
defined relative order of magnitude at all. For example, the function
$[x^2(\sin x)^2 + x + 1]/[x^2(\cos x)^2 + x]$ approaches no definite limit as
$x$ increases; on the contrary, for $x = n\pi$ (where $n$ is an integer) the
value is $1/n\pi$, whereas for $x = (n + \frac{1}{2})\pi$ it is
$(n + \frac{1}{2})\pi + 1 + 1/[(n + \frac{1}{2})\pi]$.
Although the numerator and denominator both become infinite, the
quotient neither remains between positive bounds nor tends to zero nor
tends to infinity. The numerator, therefore, is neither of the same order
as the denominator, nor of lower order nor of higher order. This
apparently startling situation merely means that our definitions are not
designed in such a way that we can compare every pair of functions.
This is not a defect; we have no desire to compare the orders of such
functions as the numerator and denominator above; knowledge of the
value of one of them gives us no useful information about the other.

¹ Another simple proof may be suggested: For $x > 1$ and $\epsilon > 0$

$$\log x = \int_1^x \frac{d\xi}{\xi} < \int_1^x \xi^{\epsilon - 1}\,d\xi = \frac{1}{\epsilon}(x^\epsilon - 1);$$

if we choose $\epsilon$ smaller than $\alpha$ (for instance $\epsilon = \alpha/2$) and
divide both members of this inequality by $x^\alpha$, then it follows that
$(\log x)/x^\alpha \to 0$ as $x \to \infty$.
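
The erratic behavior of this quotient can be observed directly. The sketch below simply evaluates it at the two families of points named in the text; the function name `q` and the sample values of `n` are illustrative.

```python
import math

def q(x):
    """The quotient [x^2 sin^2 x + x + 1] / [x^2 cos^2 x + x]."""
    return (x**2 * math.sin(x)**2 + x + 1) / (x**2 * math.cos(x)**2 + x)

for n in (10, 100, 1000):
    x_small = n * math.pi            # here the quotient is about 1/(n*pi)
    x_large = (n + 0.5) * math.pi    # here it is about x + 1 + 1/x
    print(n, q(x_small), q(x_large))
```

Along the first family the values shrink toward zero, along the second they grow without bound, so no single order relation can hold.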


d. The Order of Magnitude of a Function in 
the Neighborhood of an Arbitrary Point 


Just as we may compare the behavior of functions for $x \to \infty$
we may also compare functions that become infinite at the finite point
$x = \xi$.

We say that the function $f(x) = 1/|x - \xi|$ becomes infinite of the
first order at the point $x = \xi$, and correspondingly that the function
$1/|x - \xi|^\alpha$ becomes infinite of the order $\alpha$, provided that $\alpha$
is positive.

We recognize then that the function $e^{1/|x - \xi|}$ becomes for $x \to \xi$
infinite of higher order and the function $\log |x - \xi|$ infinite of lower
order than all these powers; that is, that the limiting relations

$$\lim_{x \to \xi}\left(|x - \xi|^\alpha \cdot e^{1/|x - \xi|}\right) = \infty \quad \text{and} \quad \lim_{x \to \xi}\left(|x - \xi|^\alpha \cdot \log |x - \xi|\right) = 0$$

hold.
To confirm this we merely put $1/|x - \xi| = y$; our statements then
reduce to the known theorem on p. 249, since

$$|x - \xi|^\alpha \cdot e^{1/|x - \xi|} = \frac{e^y}{y^\alpha} \quad \text{and} \quad |x - \xi|^\alpha \cdot \log |x - \xi| = -\frac{\log y}{y^\alpha}$$

and $y$ increases beyond all bounds as $x$ tends to $\xi$. (The method of
reducing the behavior at a point $\xi$ to the behavior at infinity by the
substitution $1/|x - \xi| = y$ frequently proves useful.)


e. The Order of Magnitude (or Smallness) of a 
Function Tending to Zero 


Just as we seek to describe the approach of a function to infinity by
means of the concept of order of magnitude, we may also specify the
way in which a function approaches zero. We say that as $x \to \infty$ the
quantity $1/x$ vanishes to the first order, the quantity $x^{-\alpha}$, where
$\alpha$ is positive, to the order $\alpha$. We find once again that the function
$1/\log x$ vanishes to a lower order than an arbitrary power $x^{-\alpha}$, that
is, for every positive $\alpha$ the relation

$$\lim_{x \to \infty}\left(x^{-\alpha} \log x\right) = 0$$

holds.
In the same way we say that for $x \to \xi$ the quantity $x - \xi$ vanishes to
the first order, the quantity $|x - \xi|^\alpha$ to the order $\alpha$. With our
results it is easy to prove the relations

$$\lim_{x \to 0}\left(|x|^\alpha \cdot \log |x|\right) = 0, \qquad \lim_{x \to 0}\left(|x|^{-\alpha} \cdot e^{-1/|x|}\right) = 0,$$

which are usually expressed as follows:


The function $1/\log |x|$ vanishes as $x \to 0$ to a lower order than any
power of $x$; the exponential function $e^{-1/|x|}$ vanishes to a higher order
than any power of $x$.


f. The “O” and “o” Notation for Orders of Magnitude


A convenient way to indicate that a function $f(x)$ is of lower order of
magnitude than a function $g(x)$ is to write $f = o(g)$. This symbolic
equation signifies only that the quotient $f/g$ has the limit zero, and can
be used to equal advantage for functions vanishing or becoming infinite
and for arguments $x$ tending to infinity or approaching a value $\xi$.¹

We can rewrite many of the results of the previous section in this
notation; for example,

$x^\alpha = o(x^\beta)$ for $\alpha < \beta$ as $x \to \infty$
$\log x = o(x^\alpha)$ for $\alpha > 0$ as $x \to \infty$
$e^{-x} = o(x^{-\alpha})$ as $x \to \infty$
$e^{-1/x} = o(x^\alpha)$ as $x \to 0$ through positive values
$\log |x| = o(1/x)$ as $x \to 0$
$1 - \cos x = o(x)$ as $x \to 0$.


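This notation, introduced by E. Landau, is useful for indicating the
order of magnitude of the error in an approximation formula. For
example,

$$\frac{1}{\sqrt{1 + 4x^2}} = \frac{1}{2x} + o\left(\frac{1}{x}\right) \quad \text{for } x \to \infty$$

stands for the relation

$$\lim_{x \to \infty} \frac{\dfrac{1}{\sqrt{1 + 4x^2}} - \dfrac{1}{2x}}{1/x} = 0.$$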

¹ The letter o is chosen to suggest the word “order.” Observe that the relation
$f = o(g)$ for vanishing $g$ means that $f$ vanishes of higher order.

Similarly, the relation between increment and differential of a function
$f$ which has a derivative at the point $x$ can be written in the form

$$f(x + h) - f(x) = hf'(x) + o(h) \quad \text{for } h \to 0.$$


Equally useful is the symbolic notation $f = O(g)$ to indicate that
$f(x)$ is at most of the order of magnitude of $g(x)$, that is, that the
quotient $f(x)/g(x)$ is bounded for the values of $x$ in question.¹ Use of the
symbol O is again very flexible. Thus the phrase “$f = O(g)$ for
$x \to \infty$” means that the quotient $f/g$ is bounded for all sufficiently
large $x$ as in

$$\sqrt{10x - 1} = O(\sqrt{x}) \quad \text{for } x \to \infty.$$

Similarly, “$f = O(g)$ for $x \to \xi$” means that $f/g$ is bounded in a
sufficiently small neighborhood of the point $x = \xi$ as in

$$e^x - 1 = O(x) \quad \text{for } x \to 0.$$

More generally we can use the equation $f = O(g)$ to indicate the
boundedness of $f/g$ in any domain of the $x$-axis without requiring $x$ to
approach a limit. Thus

$$\log x = O(x) \quad \text{for } x > 1,$$

$$x = O(\sin x) \quad \text{for } |x| < \frac{\pi}{2}.$$


Some of the earlier examples involving the symbol o can now be
refined to indicate a better estimate of the error with the help of the
symbol O. Thus we have for a function $f$ for which $f''$ is defined and
continuous

$$f(x + h) - f(x) = hf'(x) + O(h^2) \quad \text{for } h \to 0.$$

Other examples are

$$\frac{1}{\sqrt{1 + 4x^2}} = \frac{1}{2x} + O\left(\frac{1}{x^3}\right) \quad \text{for } x \to \infty,$$

$$\cos x = 1 + O(x^2) \quad \text{for all } x.$$
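
The difference between an $o(h)$ and an $O(h^2)$ error term shows up clearly in a small experiment. This is only an illustrative sketch with $f(x) = e^x$ at $x = 0$; the helper name `linearization_error` is an assumption.

```python
import math

def linearization_error(f, fprime, x, h):
    """f(x + h) - f(x) - h f'(x), the error of the differential."""
    return f(x + h) - f(x) - h * fprime(x)

for h in (1e-1, 1e-2, 1e-3):
    err = linearization_error(math.exp, math.exp, 0.0, h)
    # err / h tends to zero (the o(h) statement);
    # err / h**2 stays bounded (the O(h^2) statement).
    print(h, err / h, err / h**2)
```

Here the second quotient settles near $\frac{1}{2}$, which is $f''(0)/2$ for the exponential function.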


The same notations can be used for sequences $a_n$, letting the index $n$
tend to infinity. We shall meet some interesting examples of such
“asymptotic” formulas with an error term of higher order in the sequel
(cf. Stirling's formula for $n!$ on p. 504). A famous asymptotic law,²
already mentioned in Chapter 1, p. 56, states that the number $\pi(n)$ of
primes less than $n$ is given approximately by $n/(\log n)$. Here the order
of magnitude of the error also has been found and we have more
precisely the result

$$\pi(n) = \frac{n}{\log n} + O\left(\frac{n}{\log^2 n}\right).$$

¹ Notice that $f = O(g)$ does not mean that $f/g$ has the limit one or that the
quotient necessarily has any limit at all.

² The proof cannot be given in this book. See A. E. Ingham, The Distribution of
Primes, Cambridge University Press, 1932.
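
The asymptotic law can be illustrated, though of course not proved, by counting primes directly. The sketch below uses a sieve of Eratosthenes (a choice made here for illustration, with the hypothetical name `prime_count`) and prints the error measured in units of $n/\log^2 n$.

```python
import math

def prime_count(n):
    """Number of primes p with p <= n, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"                     # 0 and 1 are not prime
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # strike out the multiples of p starting at p*p
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)

for n in (10**4, 10**5, 10**6):
    main_term = n / math.log(n)
    error_scale = n / math.log(n) ** 2
    ratio = (prime_count(n) - main_term) / error_scale
    print(n, prime_count(n), round(ratio, 3))
```

For these values of $n$ the printed ratio stays of moderate size, which is what the $O$ term asserts; a finite table cannot by itself establish boundedness for all $n$.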


Appendix 


The difficulty in appreciating a rigorous development of calculus 
stems from a basic dilemma: Although the fundamental concepts and 
procedures, such as continuity, smoothness, etc., are motivated by 
compelling intuitive needs, they must be made precise in order to have 
any logical meaning, and the resulting rigorous definitions may cover 
phenomena beyond those of intuitive character. Thus the rigorous 
concept of continuity inevitably requires a degree of abstraction not 
completely reflected in the naive notion of a connected curve, and the 
concept of differentiability is more restrictive and more abstract than 
the vague idea of smoothness of a curve suggests. Discrepancies of this 
sort are not avoidable and may tax the patience and understanding of a 
beginner or of someone for whom logical finesse is not of primary 
interest. Nevertheless, we want to make the need for precision clearer 
to the reader by showing that, perhaps unexpectedly, precision and 
refinement are called for even by simple and intuitively comprehensible 
examples. 


A.1 Some Special Functions 


As a rule such examples need not be given in terms of single analytical
expressions (see Figs. 2.28, p. 38 and 1.30, p. 39). Here, however, we
wish to represent various typical discontinuities and “abnormal” or
unexpected phenomena by very simple expressions constructed from
the elementary functions. We begin with an example in which no
discontinuity is present.


a. The Function $y = e^{-1/x^2}$


This function (cf. Fig. 3.22) is defined in the first instance only for
values of $x$ other than zero, and obviously has the limit zero as $x \to 0$.
For by the transformation $1/x^2 = \xi$ our function becomes $y = e^{-\xi}$ and
$\lim_{\xi \to \infty} e^{-\xi} = 0$. Hence it is natural to extend our function so
that it is continuous for $x = 0$ by defining the value of the function at the
point $x = 0$ as $y(0) = 0$.

By the chain rule the derivative of our function for $x \ne 0$ is
$y' = (2/x^3)e^{-1/x^2}$, so that $|y'| = 2\xi^{3/2}e^{-\xi}$. If $x$ tends to zero,
this derivative also has the limit zero, as we find immediately from p. 250.
At the point $x = 0$ itself the derivative

$$y'(0) = \lim_{h \to 0} \frac{y(h) - y(0)}{h} = \lim_{h \to 0} \frac{e^{-1/h^2}}{h}$$

is likewise zero, so that the derivative is continuous there as well.


Figure 3.22


For the higher derivatives when $x \ne 0$, we obviously always obtain
the product of the function $e^{-1/x^2}$ and a polynomial in $1/x$, and the
passage to the limit $x \to 0$ always yields the limit zero. Hence all the
higher derivatives vanish, like $y'$, at the point $x = 0$.

Thus our function is continuous everywhere and differentiable as
many times as we please; at the point $x = 0$ it vanishes with all
its derivatives, and yet it does not vanish identically. We shall later
realize (Appendix 1.1 in Chapter 5) how remarkable or “abnormal” this
behavior is.
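
How quickly this function flattens out near the origin can be seen by dividing it by a high power of $x$. The sketch below is illustrative only; the exponent `n = 10` and the sample points are arbitrary choices.

```python
import math

def y(x):
    """y = exp(-1/x^2) for x != 0, extended by y(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

n = 10  # illustrative power; any fixed power shows the same trend
ratios = [y(x) / x**n for x in (0.5, 0.2, 0.1)]
print(ratios)
```

Even after division by $x^{10}$ the values still tend to zero as $x \to 0$, in line with the statement that the function vanishes to a higher order than any power of $x$.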


b. The Function $y = e^{-1/x}$


As easily seen, for positive values of $x$ this function behaves in the
same way as the function just dealt with; if $x$ tends to zero from one
side, through positive values, the function tends to zero, and the same is
true of all its derivatives. If we define the value of the function at
$x = 0$ as $y(0) = 0$, all the right-hand derivatives at the point $x = 0$ have
the value zero. It is quite another matter when $x$ tends to zero through
negative values; for then the function and all its derivatives become
infinite, and left-hand derivatives at the point $x = 0$ do not exist. At
the point $x = 0$, therefore, the function has a remarkable sort of
discontinuity, quite unlike the infinite discontinuities of the rational
functions considered on pp. 36, 37 (cf. Fig. 3.23).


Figure 3.23


c. The Function $y = \tanh (1/x)$


As already seen on p. 65, functions with “jump” discontinuities
can be obtained from simple functions by a passage to the limit. The
exponential function defined on p. 151 together with the principle of
compounding of functions give us another method for constructing
functions with such discontinuities from elementary functions, without
any further limiting process. An example of this is the function

$$y = \tanh \frac{1}{x} = \frac{e^{1/x} - e^{-1/x}}{e^{1/x} + e^{-1/x}}$$

and its behavior at the point $x = 0$. The function is in the first instance
not defined at this point. If we approach the point $x = 0$ through
positive values of $x$, we obviously obtain the limit 1; if, on the other
hand, we approach the point $x = 0$ through negative values, we
obtain the limit $-1$. This point $x = 0$ is thus a point of jump
discontinuity; as $x$ increases through 0 the value of the function jumps by
2 (cf. Fig. 3.24). On the other hand, the derivative

$$y' = -\frac{1}{\cosh^2 (1/x)} \cdot \frac{1}{x^2} = -\frac{1}{x^2} \cdot \frac{4}{(e^{1/x} + e^{-1/x})^2}$$

approaches the limit zero from both sides, as follows readily from
Section 3.7b, p. 249.¹


Figure 3.24 Graph of $y = \tanh (1/x)$.


d. The Function $y = x \tanh (1/x)$


In the case of the function

$$y = x \tanh \frac{1}{x} = x\,\frac{e^{1/x} - e^{-1/x}}{e^{1/x} + e^{-1/x}}$$

the preceding discontinuity is removed by the factor $x$. This function
has the limit zero as $x \to 0$ from either side, so that we can again
appropriately define $y(0)$ as equal to zero. Our function is then
continuous at $x = 0$, but its first derivative

$$y' = \tanh \frac{1}{x} - \frac{1}{x} \cdot \frac{1}{\cosh^2 (1/x)}$$

has just the same kind of discontinuity as the preceding example.
The graph of the function is a curve with a corner (cf. Fig. 3.25); at
the point $x = 0$ the function has no actual derivative but a right-hand
derivative with the value $+1$ and a left-hand derivative with the value
$-1$.


¹ Another example of the occurrence of a “jump” discontinuity is given by the
function $y = \arctan (1/x)$ as $x \to 0$.


Figure 3.25 Graph of $y = x \tanh (1/x)$.


e. The Function $y = x \sin (1/x)$, $y(0) = 0$


We have already seen that this function is not composed of a finite
number of monotonic pieces (as we may say, it is not “sectionally”
or “piecewise” monotonic) but that it is nevertheless continuous
(p. 40 and Fig. 1.31). Its first derivative

$$y' = \sin \frac{1}{x} - \frac{1}{x} \cos \frac{1}{x} \qquad (x \ne 0),$$

on the contrary, has a discontinuity at $x = 0$; for as $x$ tends to zero
this derivative oscillates continually between bounding curves, one
positive and one negative, which themselves tend to $+\infty$ and $-\infty$
respectively. At the actual point $x = 0$ the difference quotient is
$[y(h) - y(0)]/h = \sin (1/h)$; since this expression swings backward and
forward between $1$ and $-1$ an infinite number of times as $h \to 0$, the
function possesses neither a right-hand derivative nor a left-hand
derivative at $x = 0$.


A.2 Remarks on the Differentiability of Functions 


The derivative of a function which is continuous and has a derivative
at every point of an interval need not be continuous.
As a simple example we consider the function given by

$$f(x) = x^2 \sin \frac{1}{x} \quad \text{for } x \ne 0$$

and

$$f(0) = 0.$$

For $x$ different from zero the derivative is given by the expression

$$f'(x) = -x^2 \left(\cos \frac{1}{x}\right) \frac{1}{x^2} + 2x \sin \frac{1}{x} = 2x \sin \frac{1}{x} - \cos \frac{1}{x}.$$

When $x$ tends to zero, $f'(x)$ has no limit. If, on the other hand, we
form the difference quotient $[f(h) - f(0)]/h = (h^2 \sin 1/h)/h = h \sin 1/h$,
we see at once that this tends to zero as $h$ does. The derivative therefore
exists for $x = 0$ and has the value 0.


Figure 3.26

To grasp intuitively the reason for this paradoxical behavior we
represent the function graphically (cf. Fig. 3.26). It oscillates between
the curves $y = x^2$ and $y = -x^2$, which it touches alternately. Thus
the ratio of the heights of the wavecrests of our curve and their distances
from the origin steadily becomes smaller. Yet these waves do not
become flatter, for their slope, given by the derivative

$$f'(x) = 2x \sin \frac{1}{x} - \cos \frac{1}{x},$$

is equal to $-1$ at the points $x = 1/(2n\pi)$ where $\cos 1/x = 1$, and to $+1$
at the points $x = 1/[(2n + 1)\pi]$ where $\cos 1/x = -1$.
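
Both halves of this paradox can be observed numerically. The sketch below is illustrative: it evaluates the derivative formula at the two families of points just named and then the difference quotient at the origin; the sample values of `n` and `h` are arbitrary.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, with f(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

def fprime(x):
    """Derivative for x != 0; the value f'(0) = 0 comes from the difference quotient."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x) if x != 0 else 0.0

# The slope keeps returning to -1 and +1 however close we come to the origin.
for n in (1, 10, 100):
    print(n, fprime(1 / (2 * n * math.pi)), fprime(1 / ((2 * n + 1) * math.pi)))

# The difference quotient at 0 equals h sin(1/h) and is at most |h| in size.
for h in (1e-2, 1e-4, 1e-6):
    print(h, (f(h) - f(0)) / h)
```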


In contrast to the possibility illustrated here (that a derivative exist
everywhere and yet not be continuous) we state the following simple theorem,
which throws light on a whole series of earlier examples and discussions.


THEOREM. If we know that in a neighborhood of a point $x = a$, the function
$f(x)$ is continuous, and that for $x \ne a$ it also has a derivative $f'(x)$ and
if in addition the equation $\lim_{x \to a} f'(x) = b$ holds, then the derivative
$f'(x)$ exists at the point $a$ also, and $f'(a) = b$.

PROOF. The proof follows immediately from the mean value theorem.
For we have $[f(a + h) - f(a)]/h = f'(\xi)$, where $\xi$ is a value intermediate
between $a$ and $a + h$. If $h$ now tends to zero, then by hypothesis $f'(\xi)$
tends to $b$, and our statement follows at once.

A companion theorem may be proved in a similar way: If the function $f(x)$
is continuous in $a \le x \le b$ and for $a < x < b$ possesses a derivative which
increases beyond all bounds as $x$ tends to $a$, the right-hand difference
quotient $[f(a + h) - f(a)]/h$ also increases beyond all bounds as $h$ tends to
zero, so that no finite right-hand derivative exists at $x = a$. The
geometrical meaning of this statement is that at the point with the (finite)
coordinates $[a, f(a)]$ the curve has a vertical tangent.

Explicit Functions


A wide class of functions can be constructed from the elementary
functions¹ by repeated rational operations, that is, addition,
multiplication, division, and furthermore by the operations of forming
inverse functions and of compounding functions. The functions thus
described form the class of “explicit” functions or “closed expressions.”²

As a result of Part A of this chapter we state the rather general fact: 


Every explicit function can be differentiated and its derivative is again an 
explicit function. 

¹ It should be emphasized that the distinction between “elementary” and
“explicit” functions and others is in itself somewhat arbitrary. For us the
term “elementary” function includes just the rational functions, the
trigonometric and exponential functions, and their inverses.

² This name indicates that we shall encounter many other functions which
cannot be represented in this fashion but which can be constructed by means
of limiting processes such as infinite series.

Thus we have attained a fairly complete mastery of the operation or
the “algorithm” of differentiation. Yet, the inverse process, that of
integration, is generally speaking more important and presents the
major challenge. To a certain extent the challenge is met by the
fundamental theorem of calculus: To every formula of differentiation
$F'(x) = f(x)$ there corresponds an equivalent formula for the primitive
function $F(x)$ to $f(x)$ or the integral:

$$\int f(x)\,dx = F(x).$$

(More precisely we have $F(x) = \int_a^x f(u)\,du + \text{constant}$.) Thus as
more explicit formulas of differentiation are derived, additional explicit
functions can be integrated in terms of explicit functions. A first table
of integrals is listed on p. 264; in principle, it would not be difficult,
although impractical and confusing, to extend such a table very much.

In the early phases of the development of calculus many mathema- 
ticians tried to find, in explicit or closed form, the integral or primitive 
function for every explicitly given function. 

It took some time before it was realized that in principle this problem 
cannot be solved; on the contrary, for some quite elementary inte- 
grands the integral just cannot be expressed in terms of elementary 
functions (see p. 298). Thus the need for studying new types of functions 
generated by integration processes from elementary functions became 
an important stimulus for the development of analysis. Nevertheless the 
desire to integrate—when feasible—given explicit functions explicitly 
without getting hopelessly entangled in tedious consultation of tables 
or numerical computations has led to some simple devices which provide 
a certain flexibility for transforming given integrals; in fact, these 
devices permit us to carry out the integration by reduction to one of the 
elementary integrals in the Table of Integrals. 

Section 3.9 will be devoted to the development of such useful devices. 
In this connection the beginner should be cautioned against merely 
memorizing the many formulas obtained by using these technical 
devices. The student should instead direct his efforts toward gaining 
a clear understanding of the methods of integration and learning how to 
apply them. Moreover, he should remember that even when inte- 
gration by these devices is impossible, the integral does exist (at least 
for all continuous functions), and can actually be calculated to as high a 
degree of accuracy as is desired by means of numerical methods which 
will be further developed later (Section 6.1). 

In Part C of this chapter we shall endeavor to extend our conceptions
of integration and integral, quite apart from the problem of the
technique of integration.
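Table of Elementary Integrals

No.   $F'(x) = f(x)$                                  $F(x) = \int f(x)\,dx$

 1.   $x^a \ (a \neq -1)$                             $\dfrac{x^{a+1}}{a+1}$
 2.   $\dfrac{1}{x}$                                  $\log |x|$
 3.   $e^x$                                           $e^x$
 4.   $a^x \ (a \neq 1)$                              $\dfrac{a^x}{\log a}$
 5.   $\sin x$                                        $-\cos x$
 6.   $\cos x$                                        $\sin x$
 7.   $\dfrac{1}{\sin^2 x} \ (= \operatorname{cosec}^2 x)$    $-\cot x$
 8.   $\dfrac{1}{\cos^2 x} \ (= \sec^2 x)$            $\tan x$
 9.   $\sinh x$                                       $\cosh x$
10.   $\cosh x$                                       $\sinh x$
11.   $\dfrac{1}{\sinh^2 x} \ (= \operatorname{cosech}^2 x)$  $-\coth x$
12.   $\dfrac{1}{\cosh^2 x} \ (= \operatorname{sech}^2 x)$    $\tanh x$
13.   $\dfrac{1}{\sqrt{1 - x^2}} \ (|x| < 1)$         $\operatorname{arc\,sin} x$ or $-\operatorname{arc\,cos} x$
14.   $\dfrac{1}{1 + x^2}$                            $\operatorname{arc\,tan} x$ or $-\operatorname{arc\,cot} x$
15.   $\dfrac{1}{\sqrt{1 + x^2}}$                     $\operatorname{ar\,sinh} x = \log\,(x + \sqrt{1 + x^2})$
16.   $\pm\dfrac{1}{\sqrt{x^2 - 1}} \ (|x| > 1)$      $\operatorname{ar\,cosh} x = \log\,(x \pm \sqrt{x^2 - 1})$
17.   $\dfrac{1}{1 - x^2}$, $|x| < 1$                 $\operatorname{ar\,tanh} x = \dfrac{1}{2}\log\dfrac{1 + x}{1 - x}$
      $\dfrac{1}{1 - x^2}$, $|x| > 1$                 $\operatorname{ar\,coth} x = \dfrac{1}{2}\log\dfrac{x + 1}{x - 1}$
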
3.8 Table of Elementary Integrals


To each of the differentiation formulas proved earlier there
corresponds an equivalent integration formula. Since these elementary
integrals are used time and again as materials for the art of
integration, we collect them in a Table. The right-hand column
contains a number of elementary functions and the left-hand column
the corresponding derivatives. If we read the table from left to right,
we obtain in the right-hand column an indefinite integral of the
function in the left-hand column.

We also remind the reader of the fundamental theorems of the
differential and integral calculus, proved in Section 2.9, in particular,
of the fact that any definite integral is obtained from the indefinite
integral F(x) by the formula¹

$$\int_a^b f(x)\,dx = F(b) - F(a).$$


In the following sections we shall attempt to reduce the calculation of 
integrals of given functions in some way or other to the elementary 
integrals collected in our Table. Apart from special artifices which 
are learned only from experience, this reduction is based essentially on 
two useful methods: “substitution” and “integration by parts.” Each 
of these methods enables us to transform a given integral in many ways; 
the object of such transformations mostly is to reduce the given 
integral, in one step or in a sequence of steps, to one or more of the 
elementary integration formulas given above. 




3.9 The Method of Substitution 


Integrating Compound Functions 


The first of these methods is the introduction of a new variable 
(that is, the method of substitution or transformation). It aims at 
reducing the integration of composite functions—such as functions of 
x — c or of ax + b—to that of simpler functions. 


1 We shall not discuss in this chapter the somewhat different problem of calculating
special definite integrals without first finding a general primitive function.


a. The Substitution Formula. Integral of a Composite Function


The rule for integrating composite functions follows from the
corresponding chain rule for differentiation. For a composite function
$G(u) = F[\phi(u)]$ we have (see p. 218)

$$\frac{dG(u)}{du} = \frac{dF[\phi(u)]}{du} = F'[\phi(u)]\,\phi'(u). \tag{16}$$

It is sufficient for the validity of this formula that the functions $x = \phi(u)$
and F(x) are continuously differentiable in their arguments u, x
respectively, and that F(x) is defined for the values x assumed by the
function $x = \phi(u)$ (that is, the range of the function $\phi$ must belong to
the domain of F). Integrating the formula between the limits $u = \alpha$ and
$u = \beta$, we find

$$G(\beta) - G(\alpha) = F[\phi(\beta)] - F[\phi(\alpha)] = \int_\alpha^\beta F'[\phi(u)]\,\phi'(u)\,du. \tag{17}$$

If here

$$\phi(\alpha) = a, \qquad \phi(\beta) = b,$$

we have

$$F[\phi(\beta)] - F[\phi(\alpha)] = F(b) - F(a) = \int_a^b F'(x)\,dx.$$

Setting $F'(x) = f(x)$ we obtain the basic substitution formula

$$\int_\alpha^\beta f[\phi(u)]\,\phi'(u)\,du = \int_a^b f(x)\,dx \tag{18}$$

or, written suggestively in Leibnitz’s notation with the differential
$d\phi = \phi'(u)\,du$,

$$\int_\alpha^\beta f(\phi)\,\frac{d\phi}{du}\,du = \int_{\phi(\alpha)}^{\phi(\beta)} f(\phi)\,d\phi. \tag{18a}$$


Here $x = \phi(u)$ may be any function which is defined and has a
continuous derivative in the interval J with end points $\alpha$ and $\beta$; it maps
those end points into x = a and x = b respectively; the function f(x)
is assumed to be continuous in an interval I containing the images of all
points of J under the mapping $\phi$. For F(x) we can take any primitive
function of f(x).

As should be noticed the substitution rule (18) does not require that
the mapping $x = \phi(u)$ map points between $\alpha$ and $\beta$ only on points
between a and b or that different values u are mapped into different x;
all that matters is that $\alpha$ and $\beta$ are mapped into a and b and that f(x)
is defined for the values x taken by $\phi(u)$ for u between $\alpha$ and $\beta$.
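
The remark that the mapping need not be monotonic can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the helper `simpson`, the sample integrand f(x) = e^x, and the interval 0 ≤ u ≤ 2.5 are our own choices and do not come from the text. It evaluates both sides of (18) for x = sin u, which first increases and then decreases on that interval.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumed sample data (not from the text): f(x) = e^x, phi(u) = sin u.
f = math.exp
phi, dphi = math.sin, math.cos
alpha, beta = 0.0, 2.5

# Left side of (18): integral of f[phi(u)] * phi'(u) du from alpha to beta.
left = simpson(lambda u: f(phi(u)) * dphi(u), alpha, beta)
# Right side of (18): integral of f(x) dx from a = phi(alpha) to b = phi(beta).
right = simpson(f, phi(alpha), phi(beta))

print(left, right)  # the two values should agree up to quadrature error
```

Both numbers approximate e^{sin 2.5} − 1, although sin u takes every value between sin 2.5 and 1 twice on the interval.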

In terms of indefinite integrals the substitution rule takes the form

$$G(u) = \int f[\phi(u)]\,\phi'(u)\,du = \int f(x)\,dx = F(x) = F[\phi(u)]. \tag{19}$$

The differential symbols

$$\phi'(u)\,du = \frac{dx}{du}\,du \qquad \text{and} \qquad dx$$

become identical if we formally cancel the symbols du in the numerator
and denominator.


Examples. We apply formula (18) to the integrand f(x) = 1/x and
make the substitution $x = \phi(u)$, assuming $\phi(u) \neq 0$ in the interval
considered; then

$$\int \frac{\phi'(u)}{\phi(u)}\,du = \int \frac{dx}{x} = \log |x| = \log |\phi(u)|$$

or changing the name of the variable u again into x,

$$\int \frac{\phi'(x)}{\phi(x)}\,dx = \log |\phi(x)|. \tag{20}$$

If in this important formula we substitute particular functions, such
as $\phi(x) = \log x$, $\phi(x) = \sin x$, or $\phi(x) = \cos x$, we obtain¹

$$\int \frac{dx}{x \log x} = \log |\log x|, \qquad \int \cot x\,dx = \log |\sin x|, \qquad \int \tan x\,dx = -\log |\cos x|. \tag{21}$$


Further Examples.

$$\int \phi(u)\,\phi'(u)\,du = \int x\,dx = \tfrac{1}{2}x^2 = \tfrac{1}{2}[\phi(u)]^2,$$

where f(x) = x. This yields for $\phi(u) = \log u$

$$\int \frac{\log u}{u}\,du = \tfrac{1}{2}(\log u)^2. \tag{22}$$

We finally consider

$$\int \sin^n u \cos u\,du.$$

Here $x = \sin u = \phi(u)$, and hence

$$\int \sin^n u \cos u\,du = \int x^n\,dx = \frac{x^{n+1}}{n+1} = \frac{\sin^{n+1} u}{n+1}.$$


1 These and the following formulas are easily verified by showing that differentiation
of the result gives us back the integrand.
The same substitution x = sin u gives for any function f(x) continuous in
the interval $-1 \le x \le 1$

$$\int_\alpha^\beta f(\sin u)\cos u\,du = \int_{\sin\alpha}^{\sin\beta} f(x)\,dx.$$

Taking here $\alpha = 0$ and $\beta = 2\pi$ gives us an example for applying the
substitution formula to a case where the mapping function $x = \phi(u) = \sin u$ is
not monotonic throughout the interval $\alpha \le u \le \beta$. We find

$$\int_0^{2\pi} f(\sin u)\cos u\,du = \int_0^0 f(x)\,dx = 0.$$

Other Forms of the Rule 


In many applications the integral to be evaluated is given in the form

$$\int h[\phi(u)]\,du$$

in which the integrand appears as the composite function $h[\phi(u)]$
without the additional factor $\phi'(u)$. We can apply the substitution rule
(18) if we succeed in writing the integrand $h[\phi(u)]$ in the form $f[\phi(u)]\,\phi'(u)$.
This can always be achieved under the assumption that the function
$x = \phi(u)$ has a continuous derivative $\phi'(u)$ which does not vanish. For
then there exists an inverse function $u = \psi(x)$ with a continuous
derivative $du/dx = \psi'(x) = 1/\phi'(u)$. Taking for f(x) the function
$h(x)\,\psi'(x)$ we have indeed $h[\phi(u)] = f[\phi(u)]/\psi'(x) = f[\phi(u)]\,\phi'(u)$ and we
obtain from the substitution rule

$$\int h[\phi(u)]\,du = \int f[\phi(u)]\,\phi'(u)\,du = \int f(x)\,dx = \int h(x)\,\psi'(x)\,dx = \int h(x)\,\frac{du}{dx}\,dx. \tag{23}$$

The assumption $\phi'(u) \neq 0$ has been introduced in order to prevent the
expression du/dx in formula (23) from becoming infinite.

The beginner must never forget that in substituting u for $\psi(x)$ in an
integral one must not merely express the old variable x in terms of the
new one, u, and then integrate with respect to this new variable;
instead, before integrating one must multiply by the derivative of the
original variable x with respect to the new variable u. This, of course, is
suggested by Leibnitz’ notation $h\,dx = h\,\dfrac{dx}{du}\,du$. In the definite integral

$$\int_a^b h[\psi(x)]\,dx = \int_\alpha^\beta h(u)\,\phi'(u)\,du$$

we must not forget to change the limits a, b for x into the corresponding
limits $\alpha = \psi(a)$ and $\beta = \psi(b)$ for the variable u.
Examples. In order to calculate $\int \sin 2x\,dx$ we take $u = \psi(x) = 2x$ and
h(u) = sin u. We have

$$\frac{du}{dx} = \psi'(x) = 2, \qquad \frac{dx}{du} = \frac{1}{2}.$$

If we now introduce u = 2x into the integral as the new variable, then it is
transformed, not into $\int \sin u\,du$ but into

$$\frac{1}{2}\int \sin u\,du = -\frac{1}{2}\cos u = -\frac{1}{2}\cos 2x;$$

this may, of course, be verified at once by differentiating the right-hand side.

If we integrate with respect to x between the limits zero and $\pi/4$, the
corresponding limits for u = 2x are zero and $\pi/2$ and we obtain

$$\int_0^{\pi/4} \sin 2x\,dx = \frac{1}{2}\int_0^{\pi/2} \sin u\,du = -\frac{1}{2}\cos u\,\Big|_0^{\pi/2} = \frac{1}{2}.$$

Another simple example is the integral $\int_1^4 \dfrac{dx}{\sqrt{x}}$. Here we take
$u = \psi(x) = \sqrt{x}$, from which $x = \phi(u) = u^2$. Since $\phi'(u) = 2u$, we have

$$\int_1^4 \frac{dx}{\sqrt{x}} = \int_1^2 \frac{2u\,du}{u} = 2\int_1^2 du = 2.$$

As another example we consider the integral of sin (1/x) for the interval
$\tfrac{1}{2} \le x \le 1$. We have for u = 1/x or x = 1/u, $dx = -du/u^2$, and hence

$$\int_{1/2}^{1} \sin\frac{1}{x}\,dx = -\int_2^1 \frac{\sin u}{u^2}\,du = \int_1^2 \frac{\sin u}{u^2}\,du.$$
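
The three worked substitutions can be confirmed by approximating each integral before and after the change of variable. The Python sketch below is a numerical check only; the quadrature helper `simpson` and the dictionary `examples` are our own illustrative names, not part of the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Each entry pairs the integral in x with the transformed integral in u,
# including the factor dx/du and the changed limits.
examples = {
    "sin 2x on [0, pi/4]": (
        simpson(lambda x: math.sin(2 * x), 0.0, math.pi / 4),
        simpson(lambda u: 0.5 * math.sin(u), 0.0, math.pi / 2),
    ),
    "1/sqrt(x) on [1, 4]": (
        simpson(lambda x: 1 / math.sqrt(x), 1.0, 4.0),
        simpson(lambda u: 2.0, 1.0, 2.0),
    ),
    "sin(1/x) on [1/2, 1]": (
        simpson(lambda x: math.sin(1 / x), 0.5, 1.0),
        simpson(lambda u: math.sin(u) / u**2, 1.0, 2.0),
    ),
}

for name, (in_x, in_u) in examples.items():
    print(f"{name}: {in_x:.10f} {in_u:.10f}")
```

Omitting the factor dx/du in the first example would double the transformed value, which is exactly the beginner's error warned against above.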


*b. An Alternative Derivation of the Substitution Formula


Our integration formula (17) with a slight change of notation can also
be interpreted in a direct manner, based on the meaning of the definite
integral as a limit of a sum instead of being deduced from the chain rule
of differentiation.¹ To calculate the integral

$$\int_a^b h[\psi(x)]\,dx$$

(for the case a < b), we begin with an arbitrary subdivision of the
interval $a \le x \le b$, and then make the subdivision finer and finer.
We choose these subdivisions in the following way. If the function
$u = \psi(x)$ is assumed to be monotonic increasing, there is a one-to-one
correspondence between the interval $a \le x \le b$ on the x-axis and an
interval $\alpha \le u \le \beta$ of the values of $u = \psi(x)$, where $\alpha = \psi(a)$ and
$\beta = \psi(b)$. We divide this x-interval into n parts of length¹ $\Delta x$; there is
a corresponding subdivision of the u-interval into subintervals which,
in general, are not all of the same length. We denote the points of
division of the x-interval by

$$x_0 = a,\ x_1,\ x_2,\ \ldots,\ x_n = b$$

and the lengths of the corresponding u-cells by

$$\Delta u_1,\ \Delta u_2,\ \ldots,\ \Delta u_n.$$

1 The result obtained in this way is again restricted to monotonic substitutions and
thus is less general than formula (18) furnished by the chain rule (on p. 265).


The integral we are considering is then the limit of the sum

$$\sum_{\nu=1}^{n} h\{\psi(\xi_\nu)\}\,\Delta x,$$

where the value $\xi_\nu$ is arbitrarily selected from the $\nu$th subinterval of the
x-subdivision. This sum we now write in the form
$\sum_{\nu=1}^{n} h(v_\nu)\,\dfrac{\Delta x}{\Delta u_\nu}\,\Delta u_\nu$,
where $v_\nu = \psi(\xi_\nu)$. By the mean value theorem of the differential
calculus $\Delta x/\Delta u_\nu = \phi'(\eta_\nu)$, where $\eta_\nu$ is a suitably chosen intermediate
value of the variable u in the $\nu$th subinterval of the u-subdivision and
$x = \phi(u)$ denotes the inverse function of $u = \psi(x)$. If we now select
the value $\xi_\nu$ in such a way that $v_\nu$ and $\eta_\nu$ coincide, that is, $\eta_\nu = \psi(\xi_\nu)$,
$\xi_\nu = \phi(\eta_\nu)$, then our sum takes the form

$$\sum_{\nu=1}^{n} h(\eta_\nu)\,\phi'(\eta_\nu)\,\Delta u_\nu.$$

If we here make the passage to the limit letting $n \to \infty$, we obtain the
expression

$$\int_\alpha^\beta h(u)\,\frac{dx}{du}\,du$$

as the limiting value, that is, as the value of the integral we are
considering, in agreement with formula (23) given before.
Thus we arrive at the following result. 


THEOREM. Let h(u) be a continuous function of u in the interval
$\alpha \le u \le \beta$. Then if the function $u = \psi(x)$ is continuous and monotonic
and has a continuous nonvanishing derivative du/dx in $a \le x \le b$, and
$\psi(a) = \alpha$ and $\psi(b) = \beta$, then

$$\int_a^b h[\psi(x)]\,dx = \int_a^b h(u)\,dx = \int_\alpha^\beta h(u)\,\frac{dx}{du}\,du.$$

1 The assumption that the lengths of these subintervals are all equal is by no means
essential for the proof.

2 This limit exists (for $\Delta x \to 0$) and is the integral, since on account of the uniform
continuity of $u = \psi(x)$ the greatest of the lengths $\Delta u_\nu$ tends to zero with $\Delta x$.

This derivation exhibits the suggestive merit of Leibnitz’s notation.
In order to carry out the substitution $u = \psi(x)$, we need only write
(dx/du) du in place of dx, changing the limits from the original values of x
to the corresponding values of u.


c. Examples. Integration Formulas 


With the help of the substitution rule we can in many cases evaluate
a given integral $\int f(x)\,dx$ if we reduce it by means of a suitable
substitution $x = \phi(u)$ to one of the elementary integrals in our Table.
Whether such substitutions exist and how to find them are questions
to which no general answer can be given; this is rather a matter in
which practice and ingenuity, in contrast to systematic methods, come
into their own.

As an example, we evaluate the integral $\int \dfrac{dx}{\sqrt{a^2 - x^2}}$ by means of the
substitution¹ $x = \phi(u) = au$, $u = \psi(x) = x/a$, $dx = a\,du$, by which,
using No. 13 of our Table we obtain

$$\int \frac{dx}{\sqrt{a^2 - x^2}} = \int \frac{a\,du}{a\sqrt{1 - u^2}} = \operatorname{arc\,sin} u = \operatorname{arc\,sin}\frac{x}{a} \quad \text{for } |x| < |a|. \tag{24}$$

By the same substitution we similarly obtain

$$\int \frac{dx}{a^2 + x^2} = \int \frac{a\,du}{a^2(1 + u^2)} = \frac{1}{a}\operatorname{arc\,tan} u = \frac{1}{a}\operatorname{arc\,tan}\frac{x}{a}, \tag{25}$$

$$\int \frac{dx}{\sqrt{a^2 + x^2}} = \operatorname{ar\,sinh}\frac{x}{a}, \tag{26}$$

$$\int \frac{dx}{\sqrt{x^2 - a^2}} = \operatorname{ar\,cosh}\frac{x}{a} \quad \text{for } |x| > |a|, \tag{27}$$

$$\int \frac{dx}{a^2 - x^2} = \begin{cases} \dfrac{1}{a}\operatorname{ar\,tanh}\dfrac{x}{a} & \text{for } |x| < |a|, \\[2ex] \dfrac{1}{a}\operatorname{ar\,coth}\dfrac{x}{a} & \text{for } |x| > |a|, \end{cases} \tag{28}$$

formulas which occur very frequently and which can easily be verified
by differentiating the right-hand side.

1 For the sake of brevity we again take the liberty of writing the symbols dx and du
separately, that is, $dx = \phi'(u)\,du$ instead of $dx/du = \phi'(u)$ (cf. p. 180).
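
The verification by differentiation suggested here can also be carried out numerically. The following Python sketch compares a central difference quotient of each right-hand side of (24) to (28) with the corresponding integrand. The value a = 2, the sample points, and the helper names `derivative` and `checks` are assumptions made for this illustration; the ar coth branch of (28) is not covered.

```python
import math

def derivative(F, x, h=1e-6):
    """Central difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

a = 2.0  # assumed sample value with a > 0

# (label, antiderivative F, integrand f, sample point x0 inside the domain)
checks = [
    ("(24)", lambda x: math.asin(x / a), lambda x: 1 / math.sqrt(a * a - x * x), 0.7),
    ("(25)", lambda x: math.atan(x / a) / a, lambda x: 1 / (a * a + x * x), 0.7),
    ("(26)", lambda x: math.asinh(x / a), lambda x: 1 / math.sqrt(a * a + x * x), 0.7),
    ("(27)", lambda x: math.acosh(x / a), lambda x: 1 / math.sqrt(x * x - a * a), 3.0),
    ("(28)", lambda x: math.atanh(x / a) / a, lambda x: 1 / (a * a - x * x), 0.7),
]

# Absolute difference between the numerical derivative of F and f at x0.
errors = {label: abs(derivative(F, x0) - f(x0)) for label, F, f, x0 in checks}

for label, err in errors.items():
    print(label, err)
```

A check of this kind does not prove a formula, but it catches sign errors and misplaced factors of a quickly.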

3.10 Further Examples of the Substitution Method 


In this section we collect a number of examples which the reader may 
consider carefully for practice. 


By the substitution $u = 1 \pm x^2$, $du = \pm 2x\,dx$, we deduce that

$$\int \frac{x\,dx}{\sqrt{1 \pm x^2}} = \pm\sqrt{1 \pm x^2}, \tag{29}$$

$$\int \frac{x\,dx}{1 \pm x^2} = \pm\tfrac{1}{2}\log |1 \pm x^2|. \tag{30}$$

In these formulas we must take either the plus sign in all three places or the
minus sign in all three places.
By the substitution u = ax + b, du = a dx ($a \neq 0$), we obtain

$$\int \frac{dx}{ax + b} = \frac{1}{a}\log |ax + b|, \tag{31}$$

$$\int (ax + b)^\alpha\,dx = \frac{1}{a(\alpha + 1)}(ax + b)^{\alpha+1} \quad (\alpha \neq -1), \tag{32}$$

$$\int \sin (ax + b)\,dx = -\frac{1}{a}\cos (ax + b); \tag{33}$$

similarly, by means of the substitution u = cos x, du = −sin x dx, we obtain

$$\int \tan x\,dx = -\log |\cos x|, \tag{34}$$

and by means of the substitution u = sin x, du = cos x dx,

$$\int \cot x\,dx = \log |\sin x| \tag{35}$$

[cf. (21) p. 266]. Using the analogous substitutions u = cosh x, du = sinh x
dx and u = sinh x, du = cosh x dx, we obtain the formulas

$$\int \tanh x\,dx = \log \cosh x, \tag{36}$$

$$\int \coth x\,dx = \log |\sinh x|. \tag{37}$$
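By virtue of the substitution u = (a/b) tan x, du = (a/b) sec² x dx, we arrive
at the two formulas

$$\int \frac{dx}{a^2\sin^2 x + b^2\cos^2 x} = \frac{1}{b^2}\int \frac{1}{(a^2/b^2)\tan^2 x + 1}\,\frac{dx}{\cos^2 x} = \begin{cases} \dfrac{1}{ab}\operatorname{arc\,tan}\left(\dfrac{a}{b}\tan x\right), \\[2ex] -\dfrac{1}{ab}\operatorname{arc\,cot}\left(\dfrac{a}{b}\tan x\right) \end{cases} \tag{38}$$

and

$$\int \frac{dx}{a^2\sin^2 x - b^2\cos^2 x} = \begin{cases} -\dfrac{1}{ab}\operatorname{ar\,tanh}\left(\dfrac{a}{b}\tan x\right), \\[2ex] -\dfrac{1}{ab}\operatorname{ar\,coth}\left(\dfrac{a}{b}\tan x\right). \end{cases} \tag{39}$$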

We evaluate the integral

$$\int \frac{dx}{\sin x}$$

by writing sin x = 2 sin (x/2) cos (x/2) = 2 tan (x/2) cos² (x/2), and putting
u = tan (x/2), so that du = ½ sec² (x/2) dx; the integral then becomes

$$\int \frac{dx}{\sin x} = \int \frac{du}{u} = \log \left|\tan \frac{x}{2}\right|. \tag{40}$$

If we replace x by $x + \pi/2$, this formula becomes

$$\int \frac{dx}{\cos x} = \log \left|\tan \left(\frac{x}{2} + \frac{\pi}{4}\right)\right|. \tag{41}$$

The substitution u = 2x yields, if we also apply the known trigonometrical
formulas 2 cos² x = 1 + cos 2x and 2 sin² x = 1 − cos 2x, the frequently
used formulas

$$\int \cos^2 x\,dx = \tfrac{1}{2}(x + \sin x \cos x) \tag{42}$$

and

$$\int \sin^2 x\,dx = \tfrac{1}{2}(x - \sin x \cos x). \tag{43}$$


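By the substitution x = cos u, equivalent to u = arc cos x, or, more
generally, x = a cos u ($a \neq 0$), we can reduce

$$\int \sqrt{1 - x^2}\,dx \qquad \text{and} \qquad \int \sqrt{a^2 - x^2}\,dx$$

respectively to these formulas. We thus obtain

$$\int \sqrt{a^2 - x^2}\,dx = -\frac{a^2}{2}\operatorname{arc\,cos}\frac{x}{a} + \frac{x}{2}\sqrt{a^2 - x^2}. \tag{44}$$

Similarly, by the substitution x = a cosh u we obtain the formula

$$\int \sqrt{x^2 - a^2}\,dx = -\frac{a^2}{2}\operatorname{ar\,cosh}\frac{x}{a} + \frac{x}{2}\sqrt{x^2 - a^2} \tag{45}$$

and by the substitution x = a sinh u

$$\int \sqrt{a^2 + x^2}\,dx = \frac{a^2}{2}\operatorname{ar\,sinh}\frac{x}{a} + \frac{x}{2}\sqrt{a^2 + x^2}. \tag{46}$$

The substitution u = a/x, dx = −(a/u²) du leads to the formulas

$$\int \frac{dx}{x\sqrt{x^2 - a^2}} = -\frac{1}{a}\operatorname{arc\,sin}\frac{a}{x}, \tag{47}$$

$$\int \frac{dx}{x\sqrt{x^2 + a^2}} = -\frac{1}{a}\operatorname{ar\,sinh}\frac{a}{x}, \tag{48}$$

$$\int \frac{dx}{x\sqrt{a^2 - x^2}} = -\frac{1}{a}\operatorname{ar\,cosh}\frac{a}{x}. \tag{49}$$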


Finally, we consider the three integrals

$$\int \sin mx \sin nx\,dx, \qquad \int \sin mx \cos nx\,dx, \qquad \int \cos mx \cos nx\,dx,$$

where m and n are positive integers. By well-known trigonometrical
formulas we can divide each of these integrals into two parts, writing

$$\sin mx \sin nx = \tfrac{1}{2}[\cos (m - n)x - \cos (m + n)x],$$
$$\sin mx \cos nx = \tfrac{1}{2}[\sin (m + n)x + \sin (m - n)x],$$
$$\cos mx \cos nx = \tfrac{1}{2}[\cos (m + n)x + \cos (m - n)x].$$


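If we now make use of the substitutions u = (m + n)x and u =
(m − n)x respectively, we obtain directly the following system of
formulas:

$$\int \sin mx \sin nx\,dx = \begin{cases} \dfrac{1}{2}\left(\dfrac{\sin (m - n)x}{m - n} - \dfrac{\sin (m + n)x}{m + n}\right) & \text{if } m \neq n, \\[2ex] \dfrac{1}{2}\left(x - \dfrac{\sin 2mx}{2m}\right) & \text{if } m = n; \end{cases} \tag{50}$$

$$\int \sin mx \cos nx\,dx = \begin{cases} -\dfrac{1}{2}\left(\dfrac{\cos (m + n)x}{m + n} + \dfrac{\cos (m - n)x}{m - n}\right) & \text{if } m \neq n, \\[2ex] -\dfrac{1}{2}\left(\dfrac{\cos 2mx}{2m}\right) & \text{if } m = n; \end{cases} \tag{51}$$

$$\int \cos mx \cos nx\,dx = \begin{cases} \dfrac{1}{2}\left(\dfrac{\sin (m + n)x}{m + n} + \dfrac{\sin (m - n)x}{m - n}\right) & \text{if } m \neq n, \\[2ex] \dfrac{1}{2}\left(\dfrac{\sin 2mx}{2m} + x\right) & \text{if } m = n. \end{cases} \tag{52}$$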


If, in particular, we integrate from $-\pi$ to $+\pi$, we obtain from these
formulas the extremely important relations

$$\int_{-\pi}^{+\pi} \sin mx \sin nx\,dx = \begin{cases} 0 & \text{if } m \neq n, \\ \pi & \text{if } m = n, \end{cases}$$

$$\int_{-\pi}^{+\pi} \sin mx \cos nx\,dx = 0, \tag{53}$$

$$\int_{-\pi}^{+\pi} \cos mx \cos nx\,dx = \begin{cases} 0 & \text{if } m \neq n, \\ \pi & \text{if } m = n. \end{cases}$$

These are the orthogonality relations of the trigonometric functions,
which we shall encounter again in Section 8.4e.
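
The orthogonality relations (53) lend themselves to a direct numerical check. The Python sketch below tabulates the three integrals for 1 ≤ m, n ≤ 3; the helper names `simpson` and `orthogonality_table` and the range of indices are illustrative assumptions, not part of the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def orthogonality_table(max_index=3):
    """Map (m, n) to the integrals of sin*sin, sin*cos, cos*cos over [-pi, pi]."""
    rows = {}
    for m in range(1, max_index + 1):
        for n in range(1, max_index + 1):
            ss = simpson(lambda x: math.sin(m * x) * math.sin(n * x), -math.pi, math.pi)
            sc = simpson(lambda x: math.sin(m * x) * math.cos(n * x), -math.pi, math.pi)
            cc = simpson(lambda x: math.cos(m * x) * math.cos(n * x), -math.pi, math.pi)
            rows[(m, n)] = (ss, sc, cc)
    return rows

table = orthogonality_table()
for (m, n), (ss, sc, cc) in sorted(table.items()):
    print(m, n, round(ss, 8), round(sc, 8), round(cc, 8))
```

The printed table should show the value π on the diagonal m = n for the sine-sine and cosine-cosine integrals and values close to zero everywhere else.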


3.11 Integration by Parts 


a. General Formula 


The second widely used method for dealing with integration problems
expresses in integral form the rule for differentiating a product:

$$(fg)' = f'g + fg'.$$

The corresponding integral formula is (cf. p. 189)

$$f(x)g(x) = \int g(x)f'(x)\,dx + \int f(x)g'(x)\,dx$$

or

$$\int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx. \tag{54}$$

Using Leibnitz’s differential notation, this becomes

$$\int f\,dg = fg - \int g\,df. \tag{54a}$$


This formula will be referred to as the formula for integration by parts. 
It reduces the calculation of one integral to the calculation of another 
integral. Since a given integrand can be regarded as a product f(x)g'(x) 
in a great many different ways, this formula provides us with an effective 
tool for the transformation of integrals. 

Written as a formula for definite integration, the formula for
integration by parts is

$$\int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b g(x)f'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b g(x)f'(x)\,dx. \tag{54b}$$


This follows either directly by integrating the formula for the derivative 
of a product between the limits a and b or by forming the difference at 
the points b and a in formula (54). 

We can give a simple geometrical interpretation of formula (54b):
Let us suppose that y = f(x) and z = g(x) are monotonic, and that
f(a) = A, f(b) = B, $g(a) = \alpha$, $g(b) = \beta$; we can then form the inverse
of the first function and substitute in the second equation, thus obtaining
z as a function of y. We assume that this function is monotonic
increasing. Since dy = f'(x) dx and dz = g'(x) dx the formula for integration
by parts can be written [cf. the substitution rule (18), p. 265]

$$\int_\alpha^\beta y\,dz + \int_A^B z\,dy = B\beta - A\alpha,$$

in agreement with the relation made clear by Fig. 3.27,


area NOLK + area PMLQ = area OMLK — area OPQN.

Figure 3.27


The following example may serve as a first illustration:

$$\int \log x\,dx = \int \log x \cdot 1\,dx.$$

We write the integrand in this way in order to indicate that we put f(x) = log x
and g'(x) = 1, so that we have f'(x) = 1/x and g(x) = x. Our formula then
becomes

$$\int \log x\,dx = x \log x - \int \frac{x}{x}\,dx = x \log x - x. \tag{55}$$


This last expression is therefore the indefinite integral of the logarithm, as 
may be verified at once by differentiation. 
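
As a numerical illustration of the definite form (54b), the following Python sketch evaluates both sides for f(x) = log x and g(x) = x. The interval from 1 to e and the helper `simpson` are assumptions chosen for this example; with these limits formula (55) gives the exact value 1.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# f(x) = log x with f'(x) = 1/x, and g(x) = x with g'(x) = 1, as in the text.
f, df = math.log, (lambda x: 1.0 / x)
g, dg = (lambda x: x), (lambda x: 1.0)
a, b = 1.0, math.e  # assumed sample limits

# Left side of (54b): integral of f(x) g'(x) dx from a to b.
left = simpson(lambda x: f(x) * dg(x), a, b)
# Right side of (54b): boundary term minus the integral of g(x) f'(x) dx.
right = f(b) * g(b) - f(a) * g(a) - simpson(lambda x: g(x) * df(x), a, b)

print(left, right)  # both should be close to 1
```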


b. Further Examples of Integration by Parts 


With f(x) = x, $g'(x) = e^x$, we have f'(x) = 1, $g(x) = e^x$, and

$$\int x e^x\,dx = e^x(x - 1). \tag{56}$$

In a similar way we obtain

$$\int x \sin x\,dx = -x \cos x + \sin x \tag{57}$$

and

$$\int x \cos x\,dx = x \sin x + \cos x. \tag{58}$$

For f(x) = log x, $g'(x) = x^a$, we have the relation

$$\int x^a \log x\,dx = \frac{x^{a+1}}{a+1}\left(\log x - \frac{1}{a+1}\right). \tag{59}$$

Here we must assume $a \neq -1$. For a = −1 we obtain

$$\int \frac{1}{x}\log x\,dx = (\log x)^2 - \int \log x \cdot \frac{dx}{x};$$

transferring the integral on the right-hand side over to the left, we have [cf.
(22), p. 266]

$$\int \frac{1}{x}\log x\,dx = \tfrac{1}{2}(\log x)^2. \tag{60}$$

We calculate the integral $\int \operatorname{arc\,sin} x\,dx$ by taking f(x) = arc sin x, g'(x) = 1.
Hence

$$\int \operatorname{arc\,sin} x\,dx = x \operatorname{arc\,sin} x - \int \frac{x\,dx}{\sqrt{1 - x^2}}.$$

The integration on the right-hand side can be performed as in (29), p. 271;
we thus find

$$\int \operatorname{arc\,sin} x\,dx = x \operatorname{arc\,sin} x + \sqrt{1 - x^2}. \tag{61}$$

In the same way we find

$$\int \operatorname{arc\,tan} x\,dx = x \operatorname{arc\,tan} x - \tfrac{1}{2}\log (1 + x^2) \tag{62}$$

and many others of a similar type.

The following examples are of a somewhat different nature; here repeated 
integration by parts brings us back to the original integral, for which we thus 
obtain an equation. 


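In this way we obtain

$$\int e^{ax} \sin bx\,dx = -\frac{1}{b}e^{ax}\cos bx + \frac{a}{b}\int e^{ax}\cos bx\,dx$$

$$= -\frac{1}{b}e^{ax}\cos bx + \frac{a}{b^2}e^{ax}\sin bx - \frac{a^2}{b^2}\int e^{ax}\sin bx\,dx;$$

Solving this equation for the integral $\int e^{ax} \sin bx\,dx$,

$$\int e^{ax} \sin bx\,dx = \frac{1}{a^2 + b^2}\,e^{ax}(a \sin bx - b \cos bx). \tag{63}$$

In a similar way it follows that

$$\int e^{ax} \cos bx\,dx = \frac{1}{a^2 + b^2}\,e^{ax}(a \cos bx + b \sin bx). \tag{64}$$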
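
Formulas (63) and (64) are easy to get wrong by a sign, so a numerical comparison is a useful safeguard. The following Python sketch compares a quadrature value of each integral over an interval with the difference of the closed-form primitive at the end points. The constants a = 0.5 and b = 3, the interval from 0 to 2, and all function names are assumptions made for this illustration.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

a, b = 0.5, 3.0      # assumed sample constants
lo, hi = 0.0, 2.0    # assumed sample interval

def primitive_63(x):
    """Right-hand side of (63)."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)

def primitive_64(x):
    """Right-hand side of (64)."""
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)

# Quadrature value versus difference of the primitive at the end points.
sin_numeric = simpson(lambda x: math.exp(a * x) * math.sin(b * x), lo, hi)
sin_closed = primitive_63(hi) - primitive_63(lo)
cos_numeric = simpson(lambda x: math.exp(a * x) * math.cos(b * x), lo, hi)
cos_closed = primitive_64(hi) - primitive_64(lo)

print(sin_numeric, sin_closed)
print(cos_numeric, cos_closed)
```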




c. Integral Formula for f(b) + f (a) 


As a last example we derive a remarkable formula expressing the sum 
f(b) + f(a) as a definite integral (instead of the difference f(b) — f(a) given 
by the fundamental formula). Integration by parts will be applied by 
introducing 1 = g'(x), where g(x) = x — m with a constant m at our disposal. 
Then we have for the indefinite integrals

$$\int f(x)\,dx + \int f'(x)(x - m)\,dx = f(x)(x - m)$$

and for the integral between a and b

$$\int_a^b f(x)\,dx + \int_a^b f'(x)(x - m)\,dx = f(b)(b - m) - f(a)(a - m).$$

If for arbitrary a and b we choose for m the mean value m = (a + b)/2,
between a and b, we obtain, as the reader will easily verify

$$\frac{b - a}{2}\,\bigl(f(a) + f(b)\bigr) = \int_a^b f(x)\,dx + \int_a^b (x - m)f'(x)\,dx.$$


d. Recursive Formulas 


In many cases the integrand is not only a function of the independent 
variable but also depends on an integer index n; on integrating by 
parts we sometimes obtain, instead of the value of the integral, another 
similar expression in which the index n has a smaller value. We thus 
might arrive after a number of steps at an integral which we can deal 
with by means of the Table of Integrals, p. 263. Such a process is 
called recursive. 

The following examples are illustrations: By repeated integration by
parts we can calculate the trigonometrical integrals

$$\int \cos^n x\,dx, \qquad \int \sin^n x\,dx, \qquad \int \sin^m x \cos^n x\,dx,$$

provided that m and n are positive integers. For using $f(x) = \cos^{n-1} x$,
g(x) = sin x we find for the first integral that

$$\int \cos^n x\,dx = \cos^{n-1} x \sin x + (n - 1)\int \cos^{n-2} x \sin^2 x\,dx;$$

the right-hand side can be written in the form

$$\cos^{n-1} x \sin x + (n - 1)\int \cos^{n-2} x\,dx - (n - 1)\int \cos^n x\,dx;$$

thus a recursive relation is obtained:

$$\int \cos^n x\,dx = \frac{1}{n}\cos^{n-1} x \sin x + \frac{n - 1}{n}\int \cos^{n-2} x\,dx. \tag{65}$$

This formula enables us to diminish the index in the integrand step by
step until we finally arrive at the integral

$$\int \cos x\,dx = \sin x \qquad \text{or} \qquad \int dx = x,$$

depending on whether n is odd or even. In a similar way we obtain the
analogous recursive formulas

$$\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1} x \cos x + \frac{n - 1}{n}\int \sin^{n-2} x\,dx \tag{66}$$

and

$$\int \sin^m x \cos^n x\,dx = \frac{\sin^{m+1} x \cos^{n-1} x}{m + n} + \frac{n - 1}{m + n}\int \sin^m x \cos^{n-2} x\,dx. \tag{67}$$

In particular, we calculate the integrals

$$\int \sin^2 x\,dx = \tfrac{1}{2}(x - \sin x \cos x)$$

and

$$\int \cos^2 x\,dx = \tfrac{1}{2}(x + \sin x \cos x),$$

as we have already done by the method of substitution [Eqs. (42), (43),
p. 272].
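
The recursive relation (65) translates directly into a short recursive function. The Python sketch below is one possible transcription, with the two terminal cases taken as x for n = 0 and sin x for n = 1; the function names and the sample interval are our own assumptions. It checks the result against a quadrature value.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def cos_power_primitive(n, x):
    """Value at x of the primitive of cos^n x obtained by applying (65) repeatedly."""
    if n == 0:
        return x                # terminal case: integral of dx
    if n == 1:
        return math.sin(x)      # terminal case: integral of cos x dx
    leading = math.cos(x) ** (n - 1) * math.sin(x) / n
    return leading + (n - 1) / n * cos_power_primitive(n - 2, x)

lo, hi = 0.2, 1.3  # assumed sample interval
for n in range(7):
    by_recursion = cos_power_primitive(n, hi) - cos_power_primitive(n, lo)
    by_quadrature = simpson(lambda x: math.cos(x) ** n, lo, hi)
    print(n, by_recursion, by_quadrature)
```

For n = 2 the function reproduces ½(x + sin x cos x), in agreement with (42).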


It need hardly be mentioned that the corresponding integrals for the
hyperbolic functions can be calculated in exactly the same way:

$$\int \sinh^2 x\,dx = \tfrac{1}{2}(-x + \sinh x \cosh x), \tag{68}$$

$$\int \cosh^2 x\,dx = \tfrac{1}{2}(x + \sinh x \cosh x). \tag{69}$$
Further recursive formulas are given by the following transformations:

$$\int (\log x)^m\,dx = x(\log x)^m - m\int (\log x)^{m-1}\,dx, \tag{70}$$

$$\int x^m e^x\,dx = x^m e^x - m\int x^{m-1} e^x\,dx, \tag{71}$$

$$\int x^m \sin x\,dx = -x^m \cos x + m\int x^{m-1} \cos x\,dx, \tag{72}$$

$$\int x^m \cos x\,dx = x^m \sin x - m\int x^{m-1} \sin x\,dx, \tag{73}$$

$$\int x^a (\log x)^m\,dx = \frac{x^{a+1}(\log x)^m}{a + 1} - \frac{m}{a + 1}\int x^a (\log x)^{m-1}\,dx \quad (a \neq -1). \tag{74}$$


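e. Wallis’s Infinite Product for $\pi$


The recursive formula for the integral $\int \sin^n x\,dx$ with n > 1 leads
to a fascinating expression for the number $\pi$ as an “infinite product.”
In the formula

$$\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1} x \cos x + \frac{n - 1}{n}\int \sin^{n-2} x\,dx$$

we insert the limits 0 and $\pi/2$, thus obtaining

$$\int_0^{\pi/2} \sin^n x\,dx = \frac{n - 1}{n}\int_0^{\pi/2} \sin^{n-2} x\,dx \quad \text{for } n > 1. \tag{75}$$

If we repeatedly apply the recursive formula, we obtain, distinguishing
between the cases n = 2m and n = 2m + 1,

$$\int_0^{\pi/2} \sin^{2m} x\,dx = \frac{2m - 1}{2m}\cdot\frac{2m - 3}{2m - 2}\cdots\frac{1}{2}\int_0^{\pi/2} dx, \tag{76}$$

$$\int_0^{\pi/2} \sin^{2m+1} x\,dx = \frac{2m}{2m + 1}\cdot\frac{2m - 2}{2m - 1}\cdots\frac{2}{3}\int_0^{\pi/2} \sin x\,dx, \tag{76a}$$

whence

$$\int_0^{\pi/2} \sin^{2m} x\,dx = \frac{2m - 1}{2m}\cdot\frac{2m - 3}{2m - 2}\cdots\frac{1}{2}\cdot\frac{\pi}{2}, \tag{77}$$

$$\int_0^{\pi/2} \sin^{2m+1} x\,dx = \frac{2m}{2m + 1}\cdot\frac{2m - 2}{2m - 1}\cdots\frac{2}{3}. \tag{77a}$$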
0 2n+1 2m—1 3 
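By division this yields

$$\frac{\pi}{2} = \frac{2 \cdot 2}{1 \cdot 3}\cdot\frac{4 \cdot 4}{3 \cdot 5}\cdot\frac{6 \cdot 6}{5 \cdot 7}\cdots\frac{2m \cdot 2m}{(2m - 1)(2m + 1)}\cdot\frac{\int_0^{\pi/2} \sin^{2m} x\,dx}{\int_0^{\pi/2} \sin^{2m+1} x\,dx}. \tag{78}$$

The quotient of the two integrals on the right-hand side converges
to 1 as m increases, as we recognize from the following considerations.
In the interval $0 < x < \pi/2$, where 0 < sin x < 1, we have

$$0 < \sin^{2m+1} x < \sin^{2m} x < \sin^{2m-1} x;$$

consequently,

$$0 < \int_0^{\pi/2} \sin^{2m+1} x\,dx < \int_0^{\pi/2} \sin^{2m} x\,dx < \int_0^{\pi/2} \sin^{2m-1} x\,dx.$$

If we here divide each term by $\int_0^{\pi/2} \sin^{2m+1} x\,dx$ and notice that by
formula (75)

$$\frac{\int_0^{\pi/2} \sin^{2m-1} x\,dx}{\int_0^{\pi/2} \sin^{2m+1} x\,dx} = \frac{2m + 1}{2m} = 1 + \frac{1}{2m},$$

we have

$$1 < \frac{\int_0^{\pi/2} \sin^{2m} x\,dx}{\int_0^{\pi/2} \sin^{2m+1} x\,dx} < 1 + \frac{1}{2m},$$

from which the above statement follows.
Consequently, the relation

$$\frac{\pi}{2} = \lim_{m \to \infty} \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots\frac{2m}{2m - 1}\cdot\frac{2m}{2m + 1} \tag{79}$$

holds.


This product formula (due to Wallis), with its simple law of formation,
gives a most remarkable relation between the number $\pi$ and the integers.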


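Product for $\sqrt{\pi}$


As an easy consequence we can derive an equally remarkable
expression for $\sqrt{\pi}$. If we observe

$$\lim_{m \to \infty} \frac{2m}{2m + 1} = 1,$$

we can write

$$\lim_{m \to \infty} \frac{2^2 \cdot 4^2 \cdots (2m - 2)^2}{3^2 \cdot 5^2 \cdots (2m - 1)^2}\,2m = \frac{\pi}{2};$$

taking the square root and then multiplying the numerator and
denominator by $2 \cdot 4 \cdots (2m - 2)$, we find

$$\sqrt{\frac{\pi}{2}} = \lim_{m \to \infty} \frac{2 \cdot 4 \cdots (2m - 2)}{3 \cdot 5 \cdots (2m - 1)}\sqrt{2m} = \lim_{m \to \infty} \frac{2^2 \cdot 4^2 \cdots (2m - 2)^2}{(2m - 1)!}\sqrt{2m}$$

$$= \lim_{m \to \infty} \frac{2^2 \cdot 4^2 \cdots (2m)^2}{(2m)!}\cdot\frac{\sqrt{2m}}{2m} = \lim_{m \to \infty} \frac{2^{2m}(1^2 \cdot 2^2 \cdot 3^2 \cdots m^2)}{(2m)!\,\sqrt{2m}}.$$

From this we finally obtain

$$\lim_{m \to \infty} \frac{(m!)^2\,2^{2m}}{(2m)!\,\sqrt{m}} = \sqrt{\pi}, \tag{80}$$

a form of Wallis’s product which will be of use to us later (cf. Chapter 6,
Appendix).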
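
The slow convergence of Wallis's product is instructive to observe numerically. The Python sketch below computes partial products of (79) and the expression in (80) for a few values of m. The function names are illustrative assumptions, and the use of `math.lgamma` for the factorials is merely a device to avoid overflow for large m, not a method taken from the text.

```python
import math

def wallis_partial_product(m):
    """Product of (2k/(2k-1)) * (2k/(2k+1)) for k = 1, ..., m, as in (79)."""
    product = 1.0
    for k in range(1, m + 1):
        product *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return product

def sqrt_pi_estimate(m):
    """Expression (m!)^2 2^(2m) / ((2m)! sqrt(m)) from (80), computed via logarithms."""
    log_value = (2 * math.lgamma(m + 1) + 2 * m * math.log(2)
                 - math.lgamma(2 * m + 1) - 0.5 * math.log(m))
    return math.exp(log_value)

for m in (10, 100, 1000):
    # Twice the partial product approximates pi; the second value approximates sqrt(pi).
    print(m, 2 * wallis_partial_product(m), sqrt_pi_estimate(m))
```

The partial products increase toward π/2, with an error that shrinks roughly in proportion to 1/m, so the product is remarkable for its form rather than useful for computing π.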


*3.12 Integration of Rational Functions 


During the seventeenth and eighteenth centuries mathematicians 
were preoccupied with discovering classes of elementary explicit 
functions which could be integrated explicitly. A wealth of ingenious 
devices was invented and at the same time the basis for deeper under- 
standing created. When one later realized that achieving integration of 
all explicit functions in closed form was neither an attainable nor 
really an important goal, the tedious technicalities which had been 
developed in connection with such problems were gradually deempha- 
sized. Yet, a significant general result remained: 


All rational functions R(x) of a variable x can be integrated explicitly 
in terms of the elementary integrals listed in Table 3.1. 


This general result can be obtained much more easily in the context 
of the more advanced theory of functions of a complex variable. 
Yet, it is still worthwhile to sketch an elementary derivation employing 
only real variables. 
The rational functions are those of the form

$$R(x) = \frac{f(x)}{g(x)}, \tag{81}$$

where f(x) and g(x) are polynomials:

$$f(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0,$$
$$g(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0 \quad (b_n \neq 0).$$


As we recall, every polynomial can be integrated at once and its
integral is itself a polynomial. We therefore need consider only those
rational functions for which the denominator g(x) is not a constant.
Moreover, we can always assume that the degree of the numerator is
less than the degree n of the denominator. For otherwise, dividing the
polynomial f(x) by the polynomial g(x), we obtain a remainder of
degree less than n; in other words, we can write f(x) = q(x)g(x) + r(x),
where q(x) and r(x) are also polynomials and r(x) is of lower degree than
n. The integration of f(x)/g(x) is then reduced to the integration of the
polynomial q(x) and of the “proper” fraction r(x)/g(x). We notice
further that the function f(x)/g(x) can be represented as the sum of the
functions $a_\nu x^\nu/g(x)$, so that we need only consider integrands of the
form $x^\nu/g(x)$.
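
The reduction to a proper fraction rests on polynomial division with remainder. The following Python sketch shows one way to carry it out with exact rational arithmetic. The function name `poly_divmod` and the convention of listing coefficients from the highest power downward are assumptions of this illustration.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide the polynomial f by g, both given as coefficient lists, highest power first.

    Returns (q, r) with f = q*g + r and the degree of r below the degree of g.
    """
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    if not g or g[0] == 0:
        raise ValueError("leading coefficient of g must be nonzero")
    q = []
    r = f[:]
    while len(r) >= len(g):
        factor = r[0] / g[0]          # next coefficient of the quotient
        q.append(factor)
        for i in range(len(g)):
            r[i] -= factor * g[i]     # subtract factor * g, aligned at the leading term
        r.pop(0)                      # the leading coefficient is now zero
    return q, r

# Example: x^3 + 2x + 1 divided by x^2 + 1.
q, r = poly_divmod([1, 0, 2, 1], [1, 0, 1])
print(q, r)
```

In this example the quotient is q(x) = x and the remainder is r(x) = x + 1, so that (x³ + 2x + 1)/(x² + 1) = x + (x + 1)/(x² + 1), a polynomial plus a proper fraction.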


a. The Fundamental Types 


We proceed in steps to the integration of the most general rational
function of the type (81), studying first only those functions with
denominator g(x) of the particularly simple type

$$g(x) = x^n$$

or

$$g(x) = (1 + x^2)^n,$$

where n is any positive integer.

To this case we can then reduce the somewhat more general case in
which $g(x) = (\alpha x + \beta)^n$, a power of a linear expression $\alpha x + \beta$
($\alpha \neq 0$), or $g(x) = (ax^2 + 2bx + c)^n$, a power of a definite¹ quadratic
expression. If $g(x) = (\alpha x + \beta)^n$ we introduce $\xi = \alpha x + \beta$ as a new
variable. Then $d\xi/dx = \alpha$, and $x = (\xi - \beta)/\alpha$ is also a linear function
of $\xi$. Each numerator f(x) becomes a polynomial $\phi(\xi)$ of the same
degree, and consequently,

$$\int \frac{f(x)}{(\alpha x + \beta)^n}\,dx = \frac{1}{\alpha}\int \frac{\phi(\xi)}{\xi^n}\,d\xi.$$

In the second case, we write

$$ax^2 + 2bx + c = \frac{1}{a}(ax + b)^2 + \frac{d}{a} \qquad (d = ac - b^2,\ d > 0);$$

since we have assumed our expression to be quadratic and definite,
$ac - b^2$ must be positive and $a \neq 0$. By introducing the new variable

$$\xi = \frac{ax + b}{\sqrt{d}}$$

we arrive at an integral with the denominator $[(d/a)(1 + \xi^2)]^n$.

1 A quadratic expression $Q(x) = ax^2 + 2bx + c$ is said to be definite if for all real
values of x it takes values having one and the same sign, that is, if the equation
Q(x) = 0 has no real roots. For this it is necessary and sufficient that the
“discriminant” $ac - b^2$ is positive. This follows, of course, from the explicit formula
$(-b \pm \sqrt{b^2 - ac})/a$ for the roots. Equivalently, a definite quadratic expression
is one that cannot be factored into two real linear factors.

Hence in order to integrate rational functions whose denominators
are powers of a linear expression or of a definite quadratic expression
it is sufficient to be able to integrate the following types of functions:

$$\frac{1}{x^n}, \qquad \frac{x^{2\nu}}{(x^2 + 1)^n}, \qquad \frac{x^{2\nu+1}}{(x^2 + 1)^n}.$$

We shall, in fact, see that even these types need not be treated in general,
for we can reduce the integration of every rational function to the
integration of the very special forms of these three functions obtained
by taking $\nu = 0$. Accordingly, we now consider the integration of the
three expressions

$$\frac{1}{x^n}, \qquad \frac{1}{(x^2 + 1)^n}, \qquad \frac{x}{(x^2 + 1)^n}.$$


b. Integration of the Fundamental Types 


Integration of the first type of function, $1/x^n$, immediately yields the
expression $\log |x|$ if $n = 1$, and the expression $-1/[(n - 1)x^{n-1}]$ if $n > 1$, so that
in both cases the integral is again an elementary function. Functions of the
third type can be integrated immediately by introducing the new variable
$\xi = x^2 + 1$, from which we obtain $2x\,dx = d\xi$ and

\[ \int \frac{x}{(x^2 + 1)^n}\,dx = \frac{1}{2} \int \frac{d\xi}{\xi^n} =
   \begin{cases}
     \dfrac{1}{2} \log (x^2 + 1) & \text{if } n = 1, \\[2ex]
     -\dfrac{1}{2(n - 1)(x^2 + 1)^{n-1}} & \text{if } n > 1.
   \end{cases} \]

Finally, in order to calculate the integral

\[ I_n = \int \frac{dx}{(x^2 + 1)^n}, \]

where $n$ has any value exceeding one, we make use of a recursive method:
If we put

\[ \frac{1}{(x^2 + 1)^n} = \frac{1}{(x^2 + 1)^{n-1}} - \frac{x^2}{(x^2 + 1)^n} \]

so that

\[ \int \frac{dx}{(x^2 + 1)^n} = \int \frac{dx}{(x^2 + 1)^{n-1}} - \int \frac{x^2\,dx}{(x^2 + 1)^n}, \]

we can transform the right-hand side by integrating by parts, using formula
(54) on p. 275 with

\[ f(x) = x, \qquad g'(x) = \frac{x}{(x^2 + 1)^n}. \]

Then, as we have just found,

\[ g(x) = -\frac{1}{2(n - 1)(x^2 + 1)^{n-1}}, \]

and consequently, we obtain

\[ I_n = \int \frac{dx}{(x^2 + 1)^n} = \frac{x}{2(n - 1)(x^2 + 1)^{n-1}} + \frac{2n - 3}{2(n - 1)} \int \frac{dx}{(x^2 + 1)^{n-1}}. \]

The calculation of the integral $I_n$ is thus reduced to that of the integral $I_{n-1}$.
If $n - 1 > 1$ we apply the same process to the latter integral, and continue
until we finally arrive at the expression

\[ \int \frac{dx}{x^2 + 1} = \operatorname{arc\,tan} x. \]

We thus see that the integral¹ $I_n$ can be explicitly expressed in terms of
rational functions and the function arc tan $x$.
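
The reduction of $I_n$ to $I_{n-1}$ can be checked numerically. The following is a minimal sketch, not part of the original text: the names `I_n` and `simpson` are illustrative, and the antiderivative is normalized (by assumption) to vanish at $x = 0$. It applies the recurrence down to arc tan $x$ and compares the result with a Simpson approximation of the definite integral.

```python
import math

def I_n(n, x):
    """Antiderivative of 1/(x^2 + 1)^n that vanishes at x = 0,
    built from the reduction formula I_n -> I_{n-1} -> ... -> arctan."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return math.atan(x)
    # I_n = x / (2(n-1)(x^2+1)^(n-1)) + (2n-3)/(2(n-1)) * I_{n-1}
    boundary_term = x / (2 * (n - 1) * (x * x + 1) ** (n - 1))
    return boundary_term + (2 * n - 3) / (2 * (n - 1)) * I_n(n - 1, x)

def simpson(f, a, b, m=1000):
    """Composite Simpson rule with m subintervals (m must be even)."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n = 3
by_recurrence = I_n(n, 1.0) - I_n(n, 0.0)
by_quadrature = simpson(lambda x: 1.0 / (x * x + 1) ** n, 0.0, 1.0)
print(abs(by_recurrence - by_quadrature) < 1e-9)
```

The recursion depth equals $n$, so the sketch is only meant for moderate values of $n$.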

Incidentally, we could also have integrated the function $1/(x^2 + 1)^n$ directly,
using the substitution $x = \tan t$; we should then have obtained $dx = \sec^2 t\,dt$
and $1/(1 + x^2) = \cos^2 t$, so that

\[ \int \frac{dx}{(x^2 + 1)^n} = \int \cos^{2n-2} t\,dt, \]

and we have already learned [Eq. (65) p. 279] how to evaluate this integral.


¹ The integral of the function $1/(x^2 - 1)^n$ can be calculated in the same way; by the
corresponding recurrence method we reduce it to the integral

\[ \int \frac{dx}{1 - x^2} = \operatorname{ar\,tanh} x \quad (\text{or } \operatorname{ar\,coth} x). \]


c. Partial Fractions


We are now in a position to integrate the most general rational 
functions. We make use of the fact that every such function can be 
represented as the sum of so-called partial fractions, that is, as the sum of 
a polynomial and a finite number of rational functions, each one of 
which has either a power of a linear expression for its denominator and 
a constant for its numerator, or else a power of a definite quadratic 
expression for its denominator and a linear function for its numerator. 
If the degree of the numerator f(x) is less than that of the denominator 
$g(x)$, the polynomial does not occur. We know already how to integrate
each partial fraction. For according to p. 284 the denominator
can be reduced to one of the special forms $x^n$ and $(x^2 + 1)^n$, and the
fraction is then a combination of the fundamental types integrated on
p. 284.

We shall not give the general proof of the possibility of this resolution 
into partial fractions. We shall merely confine ourselves to making 
the statement of the theorem intelligible to the reader and to showing 
by examples how the resolution into partial fractions can be carried out 
in typical cases. In practice only comparatively simple functions are 
dealt with, for otherwise the computations become too cumbersome. 

As we know from elementary algebra, every real polynomial $g(x)$ can
be written in the form¹

\[ g(x) = a(x - \alpha_1)^{l_1}(x - \alpha_2)^{l_2} \cdots (x^2 + 2b_1 x + c_1)^{r_1}(x^2 + 2b_2 x + c_2)^{r_2} \cdots. \]

Here the distinct numbers $\alpha_1, \alpha_2, \ldots$ are the real roots of the equation
$g(x) = 0$, and the positive integers $l_1, l_2, \ldots$ indicate the multiplicity
of these roots; the factors $x^2 + 2b_\nu x + c_\nu$ indicate definite quadratic
expressions, of which no two are the same, with conjugate complex
roots, and the positive integers $r_1, r_2, \ldots$ give the multiplicity of these
roots.


¹ The actual proof of this so-called fundamental theorem of algebra does not belong
to algebra. It is achieved most easily by methods belonging to the theory of functions
of a complex variable.


We assume that the denominator is either given to us in this form or
that we have brought it to this form by calculating the real and
imaginary roots. Let us further suppose that the numerator $f(x)$ is of
lower degree than the denominator (cf. p. 283). Then the theorem on
resolution into partial fractions can be stated as follows: For each
factor $(x - \alpha)^l$, where $\alpha$ is any one of the real roots of multiplicity $l$,
one can determine an expression of the form

\[ \frac{A_1}{x - \alpha} + \frac{A_2}{(x - \alpha)^2} + \cdots + \frac{A_l}{(x - \alpha)^l}, \]

and for each quadratic factor $Q(x) = x^2 + 2bx + c$ in our product
which is raised to the power $r$ we can determine an expression of the
form

\[ \frac{B_1 + C_1 x}{Q} + \frac{B_2 + C_2 x}{Q^2} + \cdots + \frac{B_r + C_r x}{Q^r} \]

in such a way that the function $f(x)/g(x)$ is the sum of all these expressions
($A_\nu$, $B_\nu$, $C_\nu$ are constants). In other words, the quotient $f(x)/g(x)$
can be represented as a sum of fractions, each of which belongs to one
of the types integrated above.¹


¹ We give a brief sketch of a method by which the possibility of this decomposition
into partial fractions can be proved without using the theory of functions of complex
variables, once $g(x)$ can be factored completely into linear factors. If $g(x) =
(x - \alpha)^k h(x)$ and $h(\alpha) \neq 0$, then on the right-hand side of the equation

\[ \frac{f(x)}{g(x)} - \frac{f(\alpha)}{h(\alpha)(x - \alpha)^k} = \frac{1}{(x - \alpha)^k} \cdot \frac{f(x)h(\alpha) - f(\alpha)h(x)}{h(x)h(\alpha)} \]

the numerator obviously vanishes for $x = \alpha$; it is therefore of the form
$h(\alpha)(x - \alpha)^m f_1(x)$, where $f_1(x)$ is also a polynomial, the integer $m \geq 1$, and $f_1(\alpha) \neq 0$.
Writing $f(\alpha)/h(\alpha) = \beta$, this gives us

\[ \frac{f(x)}{g(x)} - \frac{\beta}{(x - \alpha)^k} = \frac{f_1(x)}{(x - \alpha)^{k-m} h(x)}. \]

Continuing the process, we can keep on diminishing the degree of the power of
$(x - \alpha)$ occurring in the denominator until finally no such factor is left. On the
remaining fraction we repeat the process for some other root of $g(x)$, and do this as
many times as $g(x)$ has distinct factors. By doing this not only for the real but also
for the complex roots, and by combining conjugate complex fractions we eventually
arrive at the complete decomposition into partial fractions.


In particular cases the decomposition into partial fractions can be done
easily by inspection. If, for example, $g(x) = x^2 - 1$, we see at once that

\[ \frac{1}{x^2 - 1} = \frac{1}{2} \cdot \frac{1}{x - 1} - \frac{1}{2} \cdot \frac{1}{x + 1}, \]

so that

\[ \int \frac{dx}{x^2 - 1} = \frac{1}{2} \log \left| \frac{x - 1}{x + 1} \right|. \]

More generally, if $g(x) = (x - \alpha)(x - \beta)$, that is, if $g(x)$ is a nondefinite
quadratic expression with two real zeros $\alpha$ and $\beta$, we have

\[ \frac{1}{(x - \alpha)(x - \beta)} = \frac{1}{\alpha - \beta} \cdot \frac{1}{x - \alpha} - \frac{1}{\alpha - \beta} \cdot \frac{1}{x - \beta}, \]

so that

\[ \int \frac{dx}{(x - \alpha)(x - \beta)} = \frac{1}{\alpha - \beta} \log \left| \frac{x - \alpha}{x - \beta} \right|. \]


d. Examples of Resolution into Partial Fractions. 
Method of Undetermined Coefficients 


If $g(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$, where $\alpha_i \neq \alpha_k$ if $i \neq k$,
that is, if the equation $g(x) = 0$ has only simple real roots, and if $f(x)$
is any polynomial of degree $< n$, the expression in terms of partial
fractions has the simple form

\[ \frac{f(x)}{g(x)} = \frac{a_1}{x - \alpha_1} + \frac{a_2}{x - \alpha_2} + \cdots + \frac{a_n}{x - \alpha_n}. \]

We obtain explicit expressions for the coefficients $a_1, a_2, \ldots$ if we
multiply both sides of this equation by $(x - \alpha_1)$, cancel the common
factor $(x - \alpha_1)$ in the numerator and denominator on the left and in the
first term on the right, and then put $x = \alpha_1$. This gives

\[ a_1 = \frac{f(\alpha_1)}{(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3) \cdots (\alpha_1 - \alpha_n)}. \]

The reader will observe from the rule for the derivative of a product
that the denominator on the right is $g'(\alpha_1)$, that is, the derivative of the
function $g(x)$ at the point $x = \alpha_1$. Similar formulas for $a_2, a_3, \ldots$,
obtained in this way, lead to the explicit partial fraction expansion

\[ \frac{f(x)}{g(x)} = \frac{f(\alpha_1)}{g'(\alpha_1)(x - \alpha_1)} + \frac{f(\alpha_2)}{g'(\alpha_2)(x - \alpha_2)} + \cdots + \frac{f(\alpha_n)}{g'(\alpha_n)(x - \alpha_n)}. \]
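
For simple real roots the coefficient formula $a_k = f(\alpha_k)/g'(\alpha_k)$ can be evaluated mechanically. The sketch below is an illustration under stated assumptions: the roots are supplied as exact rational numbers, $f$ is given by its coefficient list in ascending powers, and the names `poly_eval` and `simple_root_coefficients` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def simple_root_coefficients(f_coeffs, roots):
    """Coefficients a_k in f/g = sum a_k / (x - alpha_k), where
    g(x) = (x - alpha_1)...(x - alpha_n) has distinct roots and deg f < n."""
    roots = [Fraction(r) for r in roots]
    if len(set(roots)) != len(roots):
        raise ValueError("roots must be distinct (simple)")
    if len(f_coeffs) > len(roots):
        raise ValueError("degree of f must be less than the number of roots")
    coefficients = []
    for k, alpha in enumerate(roots):
        # g'(alpha_k) is the product of (alpha_k - alpha_j) over all j != k.
        g_prime = Fraction(1)
        for j, other in enumerate(roots):
            if j != k:
                g_prime *= alpha - other
        coefficients.append(poly_eval(f_coeffs, alpha) / g_prime)
    return coefficients

# 1 / ((x - 1)(x + 1)): coefficients of 1/(x - 1) and 1/(x + 1)
print(simple_root_coefficients([1], [1, -1]))
```

For $1/(x^2 - 1)$ this yields the coefficients $\tfrac12$ and $-\tfrac12$, in agreement with the decomposition found by inspection.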


As a typical example of a denominator $g(x)$ with multiple roots, we consider
the function $1/[x^2(x - 1)]$. It has a representation

\[ \frac{1}{x^2(x - 1)} = \frac{a}{x - 1} + \frac{b}{x} + \frac{c}{x^2} \]

in accordance with p. 287. If we multiply both sides of this equation by
$x^2(x - 1)$, we obtain the equation

\[ 1 = (a + b)x^2 - (b - c)x - c, \]

true for all values of $x$, from which we have to determine the coefficients $a$, $b$,
$c$. This condition cannot hold unless all the coefficients of the polynomial
$(a + b)x^2 - (b - c)x - c - 1$ are zero; that is, we must have $a + b =
b - c = c + 1 = 0$ or $c = -1$, $b = -1$, $a = 1$. We thus obtain the
resolution

\[ \frac{1}{x^2(x - 1)} = \frac{1}{x - 1} - \frac{1}{x} - \frac{1}{x^2}, \]

and consequently,

\[ \int \frac{dx}{x^2(x - 1)} = \log |x - 1| - \log |x| + \frac{1}{x}. \]

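Next we decompose the function $1/[x(x^2 + 1)]$ whose denominator has
complex zeros in accordance with the equation

\[ \frac{1}{x(x^2 + 1)} = \frac{a}{x} + \frac{bx + c}{x^2 + 1}. \]

For the coefficients we obtain $a + b = c = a - 1 = 0$ so that

\[ \frac{1}{x(x^2 + 1)} = \frac{1}{x} - \frac{x}{x^2 + 1}, \]

and consequently,

\[ \int \frac{dx}{x(x^2 + 1)} = \log |x| - \frac{1}{2} \log (x^2 + 1). \]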


As a third example we consider the function $1/(x^4 + 1)$, whose integration
was a challenge even in Leibnitz' time. We can represent the denominator
as the product of two quadratic factors:¹

\[ x^4 + 1 = (x^2 + 1)^2 - 2x^2 = (x^2 + 1 + \sqrt{2}\,x)(x^2 + 1 - \sqrt{2}\,x). \]

¹ The factorization of $x^4 + 1$ into real quadratic factors corresponds to the
factorization into conjugate complex linear factors

\[ x^4 + 1 = [(x - \epsilon)(x - \epsilon^{-1})][(x - \epsilon^3)(x - \epsilon^{-3})], \]

where

\[ \epsilon = \cos \frac{\pi}{4} + i \sin \frac{\pi}{4} = \frac{1}{\sqrt{2}}(1 + i) \]

is one of the eighth roots of $+1$, and a fourth root of $-1$ (see p. 105).

We know therefore that the resolution into partial fractions will have the form

\[ \frac{1}{x^4 + 1} = \frac{ax + b}{x^2 + \sqrt{2}\,x + 1} + \frac{cx + d}{x^2 - \sqrt{2}\,x + 1}. \]

To determine the coefficients $a$, $b$, $c$, $d$, we use the equation

\[ (a + c)x^3 + (b + d - a\sqrt{2} + c\sqrt{2})x^2 + (a + c - b\sqrt{2} + d\sqrt{2})x + (b + d - 1) = 0, \]

which is satisfied by the values

\[ a = \frac{1}{2\sqrt{2}}, \qquad b = \frac{1}{2}, \qquad c = -\frac{1}{2\sqrt{2}}, \qquad d = \frac{1}{2}. \]

We therefore have

\[ \frac{1}{x^4 + 1} = \frac{1}{2\sqrt{2}} \cdot \frac{x + \sqrt{2}}{x^2 + \sqrt{2}\,x + 1} - \frac{1}{2\sqrt{2}} \cdot \frac{x - \sqrt{2}}{x^2 - \sqrt{2}\,x + 1} \]

and, applying the method given on p. 284, we obtain

\[ \int \frac{dx}{x^4 + 1} = \frac{1}{4\sqrt{2}} \log |x^2 + \sqrt{2}\,x + 1| - \frac{1}{4\sqrt{2}} \log |x^2 - \sqrt{2}\,x + 1|
   + \frac{1}{2\sqrt{2}} \operatorname{arc\,tan} (\sqrt{2}\,x + 1) + \frac{1}{2\sqrt{2}} \operatorname{arc\,tan} (\sqrt{2}\,x - 1), \]

which may easily be verified by differentiation.
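
The verification by differentiation can also be carried out numerically. The following sketch is only an illustration: the function names and the sample points are arbitrary choices, and a central difference quotient stands in for the exact derivative, so agreement is expected only up to a small discretization error.

```python
import math

SQRT2 = math.sqrt(2.0)

def antiderivative(x):
    """Closed-form antiderivative of 1/(x^4 + 1) found by partial fractions."""
    log_part = (math.log(x * x + SQRT2 * x + 1)
                - math.log(x * x - SQRT2 * x + 1)) / (4 * SQRT2)
    atan_part = (math.atan(SQRT2 * x + 1)
                 + math.atan(SQRT2 * x - 1)) / (2 * SQRT2)
    return log_part + atan_part

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def max_error(points):
    """Largest deviation between the numerical derivative and 1/(x^4 + 1)."""
    return max(abs(central_difference(antiderivative, x) - 1.0 / (x ** 4 + 1))
               for x in points)

print(max_error([-2.0, -0.5, 0.0, 0.3, 1.0, 3.0]) < 1e-7)
```

Both quadratic factors are definite, so the logarithms are defined for every real $x$ and no special cases are needed.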

The preceding examples illustrate a general method of integrating a 
rational function f(x)/g(x). We first divide and are reduced to the case where 
the degree of f is less than that of g. We factor g(x) into linear and definite 
quadratic factors, grouping the product into powers of such factors. We 
write down the appropriate partial fraction representation for f/g with 
indeterminate coefficients a, b,c,.... Multiplying through with g(x) and 
comparing coefficients of equal powers in the resulting polynomial identity, 
we obtain a system of linear equations for the unknown coefficients that 
should just be adequate for determining those coefficients, if we really have 
the correct form for the partial fraction expansion. We are then ready to 
integrate any of the resulting partial fractions by the rules discussed before. 


3.13 Integration of Some Other Classes of Functions 


a. Preliminary Remarks on the Rational 
Representation of the Circle and the Hyperbola 


The integration of some other general classes of functions can be 
reduced to the integration of rational functions. We shall be able to 
better understand this reduction by first stating certain elementary 
facts about the trigonometric and hyperbolic functions. If we put
$t = \tan (x/2)$, elementary trigonometry yields the simple formulas

\[ \sin x = \frac{2t}{1 + t^2}, \qquad \cos x = \frac{1 - t^2}{1 + t^2}; \]

indeed, from

\[ \frac{1}{1 + t^2} = \cos^2 \frac{x}{2} \qquad \text{and} \qquad \frac{t^2}{1 + t^2} = \sin^2 \frac{x}{2} \]

and from the elementary formulas

\[ \sin x = 2 \cos^2 \frac{x}{2} \tan \frac{x}{2} \qquad \text{and} \qquad \cos x = \cos^2 \frac{x}{2} - \sin^2 \frac{x}{2} \]

we obtain these equations. They show that $\sin x$ and $\cos x$ can both be
expressed rationally in terms of the quantity $t = \tan (x/2)$. By
differentiation we have

\[ \frac{dt}{dx} = \frac{1}{2 \cos^2 (x/2)} = \frac{1 + t^2}{2}, \]

so that

\[ (82) \qquad \frac{dx}{dt} = \frac{2}{1 + t^2}; \]

hence the derivative $dx/dt$ is also a rational expression in $t$.


¹ Sometimes called "uniformization."
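
The three rational expressions in $t = \tan (x/2)$ are easy to test at sample points. This is a small sketch with an illustrative function name; it assumes $-\pi < x < \pi$ so that $t$ is finite.

```python
import math

def half_angle_forms(x):
    """Return (sin x, cos x, dx/dt) computed only from t = tan(x/2)."""
    t = math.tan(x / 2)
    sin_x = 2 * t / (1 + t * t)
    cos_x = (1 - t * t) / (1 + t * t)
    dx_dt = 2 / (1 + t * t)          # Eq. (82)
    return sin_x, cos_x, dx_dt

for x in (-2.5, -1.0, 0.0, 0.7, 3.0):
    s, c, dx_dt = half_angle_forms(x)
    # dx/dt = 2/(1 + t^2) equals 2 cos^2(x/2), the reciprocal of dt/dx.
    print(math.isclose(s, math.sin(x), abs_tol=1e-12),
          math.isclose(c, math.cos(x), abs_tol=1e-12),
          math.isclose(dx_dt, 2 * math.cos(x / 2) ** 2, abs_tol=1e-12))
```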


Figure 3.28 Parametric representation of the trigonometric functions.


*The geometrical representation of our formulas and their geometrical
meaning are given in Fig. 3.28. Here the circle $u^2 + v^2 = 1$ in a $u,v$-plane is
shown. If $x$ denotes the angle $TOP$ in the figure, then $u = \cos x$ and $v = \sin x$.
The angle $OSP$ with its vertex at the point $u = -1$, $v = 0$ is equal to $x/2$, by
a theorem in elementary geometry, and we can read off the geometrical
meaning of the parameter $t$ from the figure; $t = \tan \tfrac{1}{2}x = OR$ where $R$ is
the "projection" from $S$ of the point $P$ of the circle onto the $v$-axis. If the
point $P$ starts from $S$ and describes once the circle in the positive direction,
that is, if $x$ runs through the interval from $-\pi$ to $+\pi$, the quantity $t$ will run
through the whole range of values from $-\infty$ to $+\infty$ exactly once. (Notice
that the point $S$ itself corresponds to $t = \pm\infty$.) We have here a representation
of the general point $(u, v)$ of the circle $u^2 + v^2 = 1$ in terms of the rational
functions $u = (1 - t^2)/(1 + t^2)$ and $v = 2t/(1 + t^2)$ of the parameter $t$.
These formulas then define a rational mapping of the $t$-line onto the circle
in the $u,v$-plane (which incidentally is the two-dimensional analogue of the
stereographic projection of a sphere mentioned on p. 21). At the basis of
this rational representation of the circle lies obviously the identity

\[ (t^2 - 1)^2 + (2t)^2 = (t^2 + 1)^2. \]

Curiously enough, this formula is of interest also in number theory since it
generates for each integer $t$ Pythagorean integers $a = t^2 - 1$, $b = 2t$, and
$c = t^2 + 1$ which satisfy the identity $a^2 + b^2 = c^2$, that is, determine a right
triangle with commensurable sides. Thus $t = 2$ gives rise to the well-known
triple $a = 3$, $b = 4$, $c = 5$; for $t = 4$ we obtain $a = 15$, $b = 8$, $c = 17$, etc.
It is remarkable, and, of course, no accident, that the same algebraic identity 
is of significance in such diverse contexts as integration in closed form, 
geometry, and number theory. Linking different fields in such a manner is 
the typical trend in modern mathematics, although our particular example 
goes back to antiquity. 
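
The number-theoretic reading of the identity $(t^2 - 1)^2 + (2t)^2 = (t^2 + 1)^2$ can be made concrete in a few lines. The function name below is illustrative, and the restriction $t \geq 2$ is an assumption added here so that all three sides are positive.

```python
def pythagorean_triple(t):
    """Integers (a, b, c) with a^2 + b^2 = c^2 generated by the integer t >= 2."""
    if t < 2:
        raise ValueError("t must be an integer >= 2 for positive sides")
    return t * t - 1, 2 * t, t * t + 1

for t in (2, 3, 4):
    a, b, c = pythagorean_triple(t)
    print(t, (a, b, c), a * a + b * b == c * c)
```

For $t = 2$ and $t = 4$ this reproduces the triples $3, 4, 5$ and $15, 8, 17$ quoted in the text.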


Similarly we may express the hyperbolic functions
$\cosh x = \tfrac{1}{2}(e^x + e^{-x})$
and $\sinh x = \tfrac{1}{2}(e^x - e^{-x})$ as rational functions of a third quantity. The
most obvious way is to put $e^x = \tau$, so that we have

\[ \cosh x = \frac{1}{2}\left(\tau + \frac{1}{\tau}\right), \qquad \sinh x = \frac{1}{2}\left(\tau - \frac{1}{\tau}\right), \]

which are rational expressions for $\sinh x$ and $\cosh x$. Here again
$dx/d\tau = 1/\tau$ is rational in $\tau$. However, we obtain a closer analogy
with the trigonometric functions by introducing the quantity $t =
\tanh (x/2) = (\tau - 1)/(\tau + 1)$; we then arrive at the formulas

\[ \cosh x = \frac{1 + t^2}{1 - t^2}, \qquad \sinh x = \frac{2t}{1 - t^2}. \]

By differentiating $t = \tanh (x/2)$ we obtain, as in Eq. (82) on p. 291,
the rational expression

\[ (83) \qquad \frac{dx}{dt} = \frac{2}{1 - t^2} \]

for the derivative $dx/dt$. Here again the quantity $t$ has a geometrical
meaning similar to that which it has for the trigonometric functions,
as we see at once from Fig. 3.29.


Figure 3.29 Parametric representation of the hyperbolic functions.

We have here a rational representation of the hyperbola $u^2 - v^2 = 1$
in the $u,v$-plane by means of the equations $u = (1 + t^2)/(1 - t^2)$,
$v = 2t/(1 - t^2)$. The points on the right-hand branch of the curve are
of the form $u = \cosh x$, $v = \sinh x$ and correspond to values of $t$
with $|t| < 1$. The other branch is obtained for $|t| > 1$.

We now proceed to our integration problems. 


*b. Integration of R(cos x, sin x)


Let $R(\cos x, \sin x)$ denote an expression which is rational in the two
functions $\sin x$ and $\cos x$, that is, an expression which is formed rationally
from these two functions and constants, such as

\[ \frac{3 \sin^2 x + \cos x}{3 \cos^2 x + \sin x}. \]

If we apply the substitution $t = \tan (x/2)$, the integral

\[ \int R(\cos x, \sin x)\,dx \]

is transformed into the integral

\[ \int R\left(\frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2}\right) \frac{2}{1 + t^2}\,dt, \]

and under the integral sign we now have a rational function of $t$.
Thus we have in principle obtained the integral of our expression,
since we can now perform the integration by the methods of the
preceding section.
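
As an illustration of the substitution, consider the particular integrand $1/(2 + \cos x)$ on $0 \leq x \leq \pi/2$; this example and the helper names are choices made here, not taken from the text. With $t = \tan (x/2)$ the integrand becomes $2/(3 + t^2)$ on $0 \leq t \leq 1$, whose integral is $\pi/(3\sqrt{3})$. The sketch compares the original integral, the transformed rational integral, and this closed form, using Simpson's rule as the numerical tool.

```python
import math

def simpson(f, a, b, m=1000):
    """Composite Simpson rule with m subintervals (m must be even)."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def original(x):
    """Integrand R(cos x, sin x) = 1 / (2 + cos x)."""
    return 1.0 / (2.0 + math.cos(x))

def transformed(t):
    """Same integrand after t = tan(x/2), including the factor dx/dt."""
    cos_x = (1 - t * t) / (1 + t * t)
    return (1.0 / (2.0 + cos_x)) * 2.0 / (1 + t * t)

direct = simpson(original, 0.0, math.pi / 2)
via_t = simpson(transformed, 0.0, 1.0)      # x in [0, pi/2] maps to t in [0, 1]
closed_form = math.pi / (3 * math.sqrt(3))  # (2/sqrt 3) * arctan(1/sqrt 3)
print(abs(direct - via_t) < 1e-9, abs(via_t - closed_form) < 1e-9)
```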


c. Integration of R(cosh x, sinh x) 


In the same way, if $R(\cosh x, \sinh x)$ is an expression which is rational
in terms of the hyperbolic functions $\cosh x$ and $\sinh x$, we can effect
its integration by means of the substitution $t = \tanh (x/2)$. Recalling
Eq. (83), we have

\[ \int R(\cosh x, \sinh x)\,dx = \int R\left(\frac{1 + t^2}{1 - t^2}, \frac{2t}{1 - t^2}\right) \frac{2}{1 - t^2}\,dt. \]

(According to a previous remark we could also have introduced $\tau = e^x$
as a new variable and expressed $\cosh x$ and $\sinh x$ in terms of $\tau$.)
The integration is once again reduced to that of a rational function.


*d. Integration of R(x, √(1 − x²))


The integral $\int R(x, \sqrt{1 - x^2})\,dx$ can be reduced to the type treated in
Section 3.13b by using the substitution

\[ x = \cos u, \qquad \sqrt{1 - x^2} = \sin u, \qquad dx = -\sin u\,du; \]

from this stage the transformation $t = \tan (u/2)$ brings us to the
integration of a rational function. Incidentally, we could have carried out
the reduction in one step instead of two by using the substitution

\[ t = \sqrt{\frac{1 - x}{1 + x}}, \qquad x = \frac{1 - t^2}{1 + t^2}, \qquad \sqrt{1 - x^2} = \frac{2t}{1 + t^2}, \qquad \frac{dx}{dt} = \frac{-4t}{(1 + t^2)^2}; \]

that is, we could have introduced $t = \tan (u/2)$ directly as the new variable
and thereby obtained a rational integrand.


*e. Integration of R(x, √(x² − 1))


The integral $\int R(x, \sqrt{x^2 - 1})\,dx$ is transformed by the substitution
$x = \cosh u$ into the type treated in Section 3.13c. Here again we can
arrive at our goal directly by introducing

\[ t = \sqrt{\frac{x - 1}{x + 1}} = \tanh \frac{u}{2}. \]


*f. Integration of R(x, √(x² + 1))


The integral $\int R(x, \sqrt{x^2 + 1})\,dx$ is reduced by the transformation
$x = \sinh u$ to the type considered in Section 3.13c (p. 294) and can
therefore be integrated in terms of elementary functions. Instead of the
further reduction to the integral of a rational function by the
substitution $e^u = \tau$ or $\tanh (u/2) = t$, we could have reached the integral of a
rational function in a single step by either of the substitutions

\[ \tau = x + \sqrt{x^2 + 1}, \qquad t = \frac{-1 + \sqrt{x^2 + 1}}{x}. \]


*g. Integration of R(x, √(ax² + 2bx + c))


The integral $\int R(x, \sqrt{ax^2 + 2bx + c})\,dx$ of an expression which is
rational in terms of $x$ and the square root of an arbitrary polynomial
of the second degree in $x$ can immediately be reduced to one of the
types just treated. We write (cf. p. 284)

\[ ax^2 + 2bx + c = \frac{1}{a}(ax + b)^2 + \frac{ac - b^2}{a}. \]

If $ac - b^2 > 0$ we introduce a new variable $\xi$ by means of the
transformation $\xi = (ax + b)/\sqrt{ac - b^2}$, whereupon the surd takes the
form $\sqrt{(ac - b^2)(\xi^2 + 1)/a}$. Hence our integral when expressed in
terms of $\xi$ is of the type of Section 3.13f. The constant $a$ must here be
positive in order that the square root may have real values.

If $ac - b^2 = 0$, and $a > 0$, then by way of the formula

\[ \sqrt{ax^2 + 2bx + c} = \sqrt{a}\left(x + \frac{b}{a}\right) \]

we see that the integrand was rational in $x$ to begin with.

If, finally, $ac - b^2 < 0$, we put $\xi = (ax + b)/\sqrt{b^2 - ac}$ and obtain
for the surd the expression $\sqrt{(b^2 - ac)(\xi^2 - 1)/a}$. If $a$ is positive, our
integral is thus reduced to the type of Section 3.13e; if, on the other
hand, $a$ is negative, we write the surd in the form

\[ \sqrt{(b^2 - ac)(1 - \xi^2)/(-a)} \]

and see that the integral is thus reduced to the type of Section 3.13d.
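
The case distinction above amounts to a short decision procedure on the signs of $a$ and $ac - b^2$. The sketch below only restates that procedure; the function name and the returned labels are illustrative, and the quadratic is assumed to be written as $ax^2 + 2bx + c$, so that $b$ is half the coefficient of $x$.

```python
def classify_surd(a, b, c):
    """Name the type to which sqrt(a x^2 + 2 b x + c) is reduced."""
    if a == 0:
        raise ValueError("the polynomial must be of the second degree")
    d = a * c - b * b
    if d > 0:
        if a < 0:
            raise ValueError("the radicand is negative for every real x")
        return "3.13f"        # surd becomes a multiple of sqrt(xi^2 + 1)
    if d == 0:
        if a < 0:
            raise ValueError("the radicand is never positive")
        return "rational"     # the square root is itself linear in x
    # d < 0: the quadratic has two real zeros
    return "3.13e" if a > 0 else "3.13d"

print(classify_surd(1, 1, 5))    # x^2 + 2x + 5
print(classify_surd(1, 1, -3))   # x^2 + 2x - 3
print(classify_surd(-1, 1, 3))   # -x^2 + 2x + 3
```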


*h. Further Examples of Reduction to Integrals of Rational Functions 


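Of other types of functions which can be integrated by reduction to
rational functions we shall briefly mention two: (1) rational expressions
involving two different square roots of linear expressions,
$R(x, \sqrt{ax + b}, \sqrt{\alpha x + \beta})$; (2) expressions of the form
$R(x, \sqrt[n]{(ax + b)/(\alpha x + \beta)})$,
where $a$, $b$, $\alpha$, $\beta$ are constants. In the first type we introduce the new
variable $\xi = \sqrt{\alpha x + \beta}$, so that $\alpha x + \beta = \xi^2$, and consequently

\[ x = \frac{\xi^2 - \beta}{\alpha} \qquad \text{and} \qquad \frac{dx}{d\xi} = \frac{2\xi}{\alpha}; \]

then

\[ \int R(x, \sqrt{ax + b}, \sqrt{\alpha x + \beta})\,dx
   = \int R\left(\frac{\xi^2 - \beta}{\alpha}, \sqrt{\frac{1}{\alpha}[a\xi^2 - (a\beta - b\alpha)]}, \xi\right) \frac{2\xi}{\alpha}\,d\xi, \]

which is of the type discussed in Section 3.13g.
If in the second type we introduce the new variable

\[ \xi = \sqrt[n]{\frac{ax + b}{\alpha x + \beta}} \]

we have

\[ \xi^n = \frac{ax + b}{\alpha x + \beta}, \qquad x = \frac{-\beta\xi^n + b}{\alpha\xi^n - a}, \qquad \frac{dx}{d\xi} = \frac{a\beta - b\alpha}{(\alpha\xi^n - a)^2}\, n\xi^{n-1}, \]

and we immediately arrive at the formula

\[ \int R\left(x, \sqrt[n]{\frac{ax + b}{\alpha x + \beta}}\right) dx
   = \int R\left(\frac{-\beta\xi^n + b}{\alpha\xi^n - a}, \xi\right) \frac{a\beta - b\alpha}{(\alpha\xi^n - a)^2}\, n\xi^{n-1}\,d\xi, \]

which is the integral of a rational function.


i. Remarks on the Examples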


The preceding discussions are chiefly of theoretical interest. In 
complicated expressions the actual calculations would be far too 
involved. It is therefore expedient to take advantage, when possible, 
of the special form of the integrand to simplify the work. For example, 
to integrate $1/(a^2 \sin^2 x + b^2 \cos^2 x)$ it is better to use the substitution
$t = \tan x$ instead of that given on p. 294; for $\sin^2 x$ and $\cos^2 x$ can be
expressed rationally in terms of $\tan x$, and it is therefore unnecessary
to go back to $t = \tan (x/2)$. The same is true for every expression formed
rationally from¹ $\sin^2 x$, $\cos^2 x$, and $\sin x \cos x$. Moreover, for the
calculation of many integrals a trigonometrical form is to be preferred
to a rational one, provided that the trigonometrical form can be
evaluated by some simple recurrence method. For example, although
the integrand in $\int x^n (\sqrt{1 - x^2})^m\,dx$ can be reduced to a rational form,
it is better to write $x = \sin u$ and bring it to the form $\int \sin^n u \cos^{m+1} u\,du$,
since this can easily be treated by the recurrence method on p. 279
(or by using the addition theorems to reduce the powers of the sine and
cosine to sines and cosines of multiple angles).


For the evaluation of the integral

\[ \int \frac{dx}{a \cos x + b \sin x}, \]

instead of referring to the general theory we write

\[ A = \sqrt{a^2 + b^2}, \qquad \sin \theta = \frac{a}{A}, \qquad \cos \theta = \frac{b}{A}. \]

The integral then takes the form

\[ \frac{1}{A} \int \frac{dx}{\sin (x + \theta)}, \]

and on introducing the new variable $x + \theta$ we find [cf. Eq. (40), p. 272] that
the value of the integral is

\[ \frac{1}{A} \log \left| \tan \frac{x + \theta}{2} \right|. \]


¹ For $\sin x \cos x = \tan x \cos^2 x$ can, of course, be expressed rationally in terms of
$\tan x$.




Part C Further Steps in the Theory of Integral Calculus 


3.14 Integrals of Elementary Functions 


a. Definition of Functions by Integrals. 
Elliptic Integrals and Functions 


With the examples already given of types of functions which can be 
integrated by reduction to rational functions, we have practically 
exhausted the list of functions which are integrable in terms of
elementary functions. Attempts to express indefinite integrals such as
(for $n > 2$)

\[ \int \frac{dx}{\sqrt{a_0 + a_1 x + \cdots + a_n x^n}}, \qquad
   \int \sqrt{a_0 + a_1 x + \cdots + a_n x^n}\,dx, \qquad \text{or} \qquad
   \int \frac{e^x}{x}\,dx \]

in terms of elementary functions have failed; in the nineteenth century
it was finally proved that it is actually impossible to carry out these
integrations in terms of elementary functions.

If therefore the object of the integral calculus were to integrate 
functions explicitly, we should have come to a definite halt. However, 
such a restricted objective has no intrinsic justification; it is of an 
artificial nature. We know that the integral of every continuous 
function exists as a limit and is itself a continuous function of the 
upper limit whether or not the integral can be expressed in terms of 
elementary functions. The distinguishing features of the elementary 
functions are based on the fact that their properties are easily recog- 
nized, that their application to numerical problems is facilitated by 
convenient tables, or that they can easily be calculated with as great a 
degree of accuracy as we please. 

Whenever the integral of a function cannot be expressed by means of 
functions with which we are already acquainted, there is no objection 
to introducing this integral as a new “higher” function, which really 
means no more than giving the integral a name. Whether the intro- 
duction of such a new function is convenient depends on the properties 
which it possesses, the frequency with which it occurs, and the ease 
with which it can be manipulated in theory and in practice. In this
sense the process of integration is a general principle for the generation
of new functions.

We are already acquainted with this principle from our dealings with 
the elementary functions. Thus we were forced (p. 145) to introduce the 
integral of 1/x as a new function, which we called the logarithm and 
whose properties we could easily derive. We could have introduced the 
trigonometric functions in a similar way, making use only of the 
rational functions, the process of integration, and the process of 
inversion. For this purpose we need only take one or other of the
equations

\[ \operatorname{arc\,tan} x = \int_0^x \frac{dt}{1 + t^2} \]

or

\[ \operatorname{arc\,sin} x = \int_0^x \frac{dt}{\sqrt{1 - t^2}} \]

as the definition of the function arc tan $x$ or arc sin $x$ respectively, and
then obtain the trigonometric functions by inversion. By this process
the definition of these functions is divorced from intuitive geometry
(in particular, from the intuitive notion of "angle"), but we are left
with the task of developing their properties independently of
geometry.¹ (Later, in Section 3.16 we shall give another purely analytic
discussion of the trigonometric functions.)


* Elliptic Integrals 


The first important example which leads beyond the set of elementary 
functions is given by the elliptic integrals. These are integrals in which 
the integrand depends rationally on the square root of a polynomial 
of third or fourth degree. Among these integrals the function 


\[ u(s) = \int_0^s \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}} \]

has become particularly important. Its inverse function $s(u)$ similarly
plays an important role.² This function $s(u)$ has been as thoroughly
examined and tabulated as the elementary functions.³
It is the prototype of the so-called elliptic functions which occupy a
central position in the theory of functions of a complex variable and
occur in many physical applications (for example, in connection with the
motion of a simple pendulum; see p. 410).


¹ We shall not go into the development of these ideas here. The essential step is to
prove the addition theorems for the inverse functions, that is, for the sine and the
tangent.

² For the special value $k = 0$ we obtain $u(s) = \operatorname{arc\,sin} s$ and $s(u) = \sin u$ respectively.

³ The function $s(u)$, one of the so-called Jacobian elliptic functions, is usually denoted
by the symbol sn $u$ to indicate that it is a generalization of the ordinary sine-function.

The name “elliptic integral” arises from the fact that such integrals 
enter into the problem of determining the length of an arc of an ellipse 
(cf. Chapter 4, p. 378). 
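
Although $u(s)$ is not elementary for $k \neq 0$, it is easily computed to any desired accuracy, which is the sense in which such a function may be regarded as "known". The following sketch is only an illustration: the name `elliptic_u`, the use of Simpson's rule, and the restriction to $0 \leq s < 1$, $0 \leq k \leq 1$ (which keeps the integrand finite) are assumptions made here. For $k = 0$ the result should agree with arc sin $s$.

```python
import math

def elliptic_u(s, k, m=2000):
    """Simpson approximation of the integral of
    1 / sqrt((1 - x^2)(1 - k^2 x^2)) from 0 to s, for 0 <= s < 1, 0 <= k <= 1."""
    if not (0.0 <= s < 1.0 and 0.0 <= k <= 1.0):
        raise ValueError("this sketch assumes 0 <= s < 1 and 0 <= k <= 1")

    def integrand(x):
        return 1.0 / math.sqrt((1 - x * x) * (1 - k * k * x * x))

    h = s / m                      # m must be even for Simpson's rule
    total = integrand(0.0) + integrand(s)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

print(abs(elliptic_u(0.5, 0.0) - math.asin(0.5)) < 1e-9)
print(elliptic_u(0.5, 0.5) > elliptic_u(0.5, 0.0))
```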


We point out further that integrals which at first glance have quite a
different appearance turn out to be elliptic integrals after a simple substitution.
As an example, the integral

\[ \int \frac{dx}{\sqrt{\cos \alpha - \cos x}} \]

is transformed by means of the substitution $u = \cos (x/2)$ into the integral

\[ -k\sqrt{2} \int \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}}, \qquad k = \frac{1}{\cos (\alpha/2)}; \]

the integral

\[ \int \frac{dx}{\sqrt{\cos 2x}} \]

by means of the substitution $u = \sin x$ becomes

\[ \int \frac{du}{\sqrt{(1 - u^2)(1 - 2u^2)}}; \]

and finally the integral

\[ \int \frac{dx}{\sqrt{1 - k^2 \sin^2 x}} \]

is transformed by the substitution $u = \sin x$ into

\[ \int \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}}. \]

b. On Differentiation and Integration 


Another remark on the relation between differentiation and inte- 
gration should be inserted. Differentiation may be considered a more 
elementary process than integration, because it does not lead us out of 
the domain of “known” functions. On the other hand, we must 
remember that the differentiability of an arbitrary continuous function 
is by no means a foregone conclusion but a stringent assumption. In 
fact, as we have seen, there are continuous functions which are not 
differentiable at certain isolated points, whereas since Weierstrass'
time many examples of continuous functions have been constructed
which do not possess a derivative anywhere.¹ In contrast, even though
integration in terms of elementary functions is generally not possible, 
we are certain at least that the integral of a continuous function exists. 

Taken all in all, integration and differentiation cannot be contrasted 
simply as more elementary and less elementary operations; from some 
points of view the former and from other points of view the latter could 
be thought of as more elementary. 

Insofar as the concept of integral is concerned, we shall free ourselves 
in the next section from the assumption that the integrand is everywhere 
continuous; we shall see that it may be extended to wide classes of 
functions which have discontinuities. 


3.15 Extension of the Concept of Integral 


a. Introduction. Definition of “Improper” Integrals 


In Chapter 2, p. 128, we defined $\int_a^b f(x)\,dx$ by forming the "Riemann
sums"

\[ F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i \]

based on a subdivision of the interval $[a, b]$ into $n$ subintervals of
lengths $\Delta x_i$ and a choice of intermediate points $\xi_i$ in those subintervals.
If the sequences $F_n$ tend to the same limit $F_a^b$ for any sequence of
subdivisions and intermediate points, as long as the largest value $\Delta x_i$ tends
to zero, we define $\int_a^b f(x)\,dx$ to be that limit $F_a^b$. This limit was shown to
exist when $f(x)$ is continuous in $[a, b]$. However, we are often
confronted with the need for defining an integral when $f(x)$ is not defined,
or not continuous, in all points of the closed interval $[a, b]$ or when the
interval of integration extends to infinity. We would wish, for example,
to attach an appropriate meaning to expressions such as

\[ \int_0^1 \frac{dx}{\sqrt{x}} \qquad \text{or} \qquad \int_0^1 \sin \frac{1}{x}\,dx, \]

\[ \int_0^\infty e^{-x}\,dx \qquad \text{or} \qquad \int_0^\infty \frac{\sin x}{x}\,dx, \qquad \text{etc.} \]


1 Compare Titchmarsh, The Theory of Functions, Oxford, 1932, Sections 11.21 to 
11.23, pp. 350-354. 




We first of all extend the concept of the integral to functions that are 
continuous in the open interval (a, b) but are not necessarily defined or 
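continuous at the endpoints $a$, $b$. For any numbers $\alpha$, $\beta$ with $a < \alpha <
\beta < b$ the ordinary ("proper") integral $\int_\alpha^\beta f(x)\,dx$ is then defined. If

\[ F = \lim_{\epsilon \to 0} \int_{\alpha_\epsilon}^{\beta_\epsilon} f(x)\,dx \]

exists when $a < \alpha_\epsilon < \beta_\epsilon < b$ and $\lim_{\epsilon \to 0} \alpha_\epsilon = a$,
$\lim_{\epsilon \to 0} \beta_\epsilon = b$, and if $F$ is
independent of the particular choice of $\alpha_\epsilon$ and $\beta_\epsilon$, we say that the
improper integral $\int_a^b f(x)\,dx$ converges and has the value $F$.
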
Sectionally Continuous Integrand. If, more generally, $f(x)$ is defined
and continuous in $(a, b)$ with the possible exception of a finite number
of intermediate points $c_1, c_2, \ldots, c_n$ and $f$ is continuous in each of the
open intervals $(a, c_1), (c_1, c_2), \ldots, (c_n, b)$ we define $\int_a^b f(x)\,dx$ as the
sum of the improper integrals over the subintervals, provided each of
those converges.

The improper integral $\int_a^b f(x)\,dx$ always converges when $f$ is
continuous and bounded in the open interval $(a, b)$. For example, the integral

\[ \int_0^1 \sin \frac{1}{x}\,dx = \lim_{\epsilon \to 0} \int_\epsilon^1 \sin \frac{1}{x}\,dx \]

converges. To prove this general statement we may assume, for brevity,
that $f$ is continuous at $b$, but not necessarily at $a$. Then by definition

\[ \int_a^b f(x)\,dx = \lim_{\alpha \to a} F(\alpha), \]

where $F(\alpha)$ for $a < \alpha < b$ is defined as $\int_\alpha^b f(x)\,dx$. If $M$ is an upper
bound for $|f|$ and $\alpha_n$ a sequence tending to $a$, we have by the mean
value theorem of integral calculus $|F(\alpha_n) - F(\alpha_m)| \leq M |\alpha_n - \alpha_m|$;
hence, by Cauchy's convergence test $\lim_{n \to \infty} F(\alpha_n)$ exists. Since this is the
case for any sequence $\alpha_n$ converging to $a$, it follows that
$\lim_{\alpha \to a} F(\alpha)$ exists.

As a matter of fact, when $f$ is continuous and bounded in $(a, b)$ we
can assign to $f$ any values at the endpoints $a$, $b$ and also obtain $\int_a^b f(x)\,dx$
directly as a "proper" integral defined as the limit of Riemann sums.
It is easily seen that for continuous bounded $f$ both definitions apply
and lead to the same value, independently of the choice of $f(a)$ and $f(b)$.
The same is true more generally for bounded functions that are defined
and continuous in $(a, b)$ with the possible exception of a finite number
of points. In particular, $\int_a^b f(x)\,dx$ always exists when $f$ is continuous
except for a finite number of jump discontinuities. Altogether
convergence of the improper integral of a function over a finite interval
demands attention only when $f$ becomes infinite.

We note that the geometrical interpretation of the integral as the
area under the curve is unchanged from the interpretation for a
continuous $f$ (Fig. 3.30).


Figure 3.30 Integral of a function with discontinuities.


b. Functions with Infinite Discontinuities 


We begin with the integral

$$J = \int_0^1 \frac{dx}{x^\alpha},$$

where $\alpha$ is a positive number. The integrand $1/x^\alpha$ becomes infinite for $x \to 0$.
We therefore must define J by taking the integral $J_\epsilon$ from the positive limit $\epsilon$
to the limit 1, and finally letting $\epsilon$ tend to zero. According to the elementary
rules of integration, we obtain, provided $\alpha \ne 1$,

$$J_\epsilon = \int_\epsilon^1 \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha}\left(1 - \epsilon^{1-\alpha}\right).$$

We immediately recognize the following possibilities: (1) $\alpha$ is greater than 1;
then for $\epsilon \to 0$ the right-hand side tends to infinity. (2) $\alpha$ is less than 1;
then the right-hand side tends to the limit $1/(1 - \alpha)$. In the second case,
therefore, we shall simply have to take this limiting value as the integral
$J = \int_0^1 dx/x^\alpha$. In the first case we shall say that the integral from 0 to 1 does
not exist or diverges. (3) In the third case, where $\alpha = 1$, the integral is
equal to $-\log \epsilon$ and therefore for $\epsilon \to 0$ does not approach a limit, but tends
to infinity; that is, the integral $\int_0^1 dx/x$ does not exist or is divergent.
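
The three cases can be checked numerically. The following Python sketch is not part of the original text; the function name `J_eps` is chosen here for illustration, and it simply evaluates the closed form of the truncated integral for shrinking values of the lower limit.

```python
import math

def J_eps(alpha: float, eps: float) -> float:
    """Integral of x**(-alpha) from eps to 1, evaluated in closed form."""
    if alpha == 1.0:
        return -math.log(eps)
    return (1.0 - eps ** (1.0 - alpha)) / (1.0 - alpha)

# alpha < 1 settles near 1/(1 - alpha); alpha >= 1 grows without bound.
for alpha in (0.5, 1.0, 2.0):
    values = [J_eps(alpha, 10.0 ** (-k)) for k in (2, 4, 8)]
    print(alpha, values)
```

For alpha = 0.5 the printed values approach 2, while for alpha = 1 and alpha = 2 they keep increasing as the lower limit shrinks.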


Another example for an integrand with an infinite discontinuity is given by
$f(x) = 1/\sqrt{1 - x^2}$. We find

$$\int_0^{1-\epsilon} \frac{dx}{\sqrt{1 - x^2}} = \arcsin(1 - \epsilon).$$

For $\epsilon \to 0$, the right-hand side converges to the limit $\pi/2$; this therefore is
the value of the integral

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \frac{\pi}{2},$$

although the integrand becomes infinite at the point x = 1.


c. Interpretation as Areas 


Improper integrals can be interpreted as areas of regions extending to
infinity defined by means of a passage to the limit from bounded regions.
For example, the preceding results for the function $1/x^\alpha$ assert that the area
bounded by the x-axis, the line x = 1, the line $x = \epsilon$, and the curve $y = 1/x^\alpha$
tends to a finite limit as $\epsilon \to 0$, provided that $\alpha < 1$, and that it tends to
infinity if $\alpha \ge 1$. This fact may be simply expressed as follows: The area
between the x-axis, the y-axis, the curve $y = 1/x^\alpha$, and the line x = 1 is finite
or infinite according as $\alpha < 1$ or $\alpha \ge 1$.

Intuition can, of course, give us no reliable information about the finiteness
or infiniteness of the area of a region stretching to infinity. Figure 3.31
illustrates the fact that for $\alpha < 1$ the area under our curve remains finite,
whereas for $\alpha \ge 1$ it is infinite, a fact which is certainly not suggested by
geometrical intuition.

Figure 3.31 To illustrate the convergence or divergence of improper integrals.


d. Tests for Convergence 


To check the convergence of an integral of a function f(x) with an infinite 
discontinuity at the point x = b we can often use the following criterion. 


Let the function f(x) be continuous in the interval $a \le x < b$, and let
$\lim_{x \to b} f(x) = \infty$. Then the integral $\int_a^b f(x)\,dx$ converges if there exist both a
positive number $\mu$ less than 1 and a fixed number M independent of x, such
that everywhere in the interval $a \le x < b$ the inequality $|f(x)| \le M/(b - x)^\mu$
is true; in other words, if at the point x = b the function f(x) becomes infinite
of a lower order than the first: $f(x) = O[1/(b - x)^\mu]$ for some $\mu < 1$. On the
other hand, the integral diverges if there exist both a number $\nu \ge 1$ and a fixed
positive number N, such that everywhere in the interval $a \le x < b$ the inequality
$f(x) \ge N/(b - x)^\nu$ is true; in other words, if at the point x = b the positive
function f(x) becomes infinite of the first order at least.

The proof follows almost immediately by comparison with the very simple
special case just discussed. In order to prove the first part of the theorem we
observe that for $a \le x < b$ we have

$$0 \le \frac{M}{(b - x)^\mu} + f(x) \le \frac{2M}{(b - x)^\mu}$$

and hence also, for $0 < \epsilon < b - a$,

$$0 \le \int_a^{b-\epsilon} \left[\frac{M}{(b - x)^\mu} + f(x)\right] dx \le \int_a^{b-\epsilon} \frac{2M}{(b - x)^\mu}\,dx.$$

As $\epsilon \to 0$ the integral on the right, which is obtained from the integral $\int dx/x^\mu$
by a simple substitution of b − x for x, has a limit and therefore stays
bounded. Moreover, the values of the integral in the middle increase
monotonically as $\epsilon \to 0$; since they are also bounded, they must possess a
limit and the integral

$$\int_a^b \left[\frac{M}{(b - x)^\mu} + f(x)\right] dx
= \lim_{\epsilon \to 0} \left(\int_a^{b-\epsilon} \frac{M}{(b - x)^\mu}\,dx + \int_a^{b-\epsilon} f(x)\,dx\right)$$

converges. The convergence of the integral of $M/(b - x)^\mu$ then also implies
that of $\int_a^b f(x)\,dx$.


The proof of the second part of the theorem is left as an exercise for the 
reader. 

We likewise see at once that exactly analogous theorems hold where the 
lower boundary of the integral is a point of infinite discontinuity. If a point 
of infinite discontinuity lies in the interior of the interval of integration, we 
merely separate the interval into two subintervals by this point and then apply 
these considerations to each of these. 

As an example we consider the elliptic integral

$$\int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}} \qquad (k^2 < 1).$$

From the identity $1 - x^2 = (1 - x)(1 + x)$ we see at once that as $x \to 1$ the
integrand becomes infinite only of order 1/2, from which it follows that the
improper integral converges. (For k = 1 the integral diverges.)


e. Infinite Interval of Integration 


Another important extension of the concept of integral concerns an 
infinite interval of integration. For a precise formulation, we introduce 
the following notation: If the integral

$$\int_a^A f(x)\,dx,$$

with a fixed, tends to a definite limit for $A \to \infty$, we define the integral
of f(x) over the infinite interval $x \ge a$, as

$$\int_a^\infty f(x)\,dx = \lim_{A \to \infty} \int_a^A f(x)\,dx.$$

Again, such an integral is called convergent.


Examples. Simple examples of the various possibilities are again given by
the functions $f(x) = 1/x^\alpha$,

$$\int_1^A \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha}\left(A^{1-\alpha} - 1\right).$$

Here we see that, if we again exclude the case $\alpha = 1$, the integral to infinity
exists for the case $\alpha > 1$, and, in fact,

$$\int_1^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1};$$

when $\alpha < 1$, the integral no longer exists. For the case $\alpha = 1$ the integral
again clearly fails to exist since log x tends to infinity as x does. We see
therefore that with regard to integration over an infinite interval the functions
$1/x^\alpha$ do not behave in the same way as for integration up to the origin. This
statement also is made plausible by a glance at Fig. 3.31. For obviously, the
larger $\alpha$ is, the more closely do the curves draw towards the x-axis for $x \to \infty$;
thus it is plausible that the area under consideration tends to a definite limit
for sufficiently large values of $\alpha$.
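
The reversed roles of large and small exponents can be seen in a short numerical sketch. The Python helper below is illustrative only (the name `tail_integral` is not from the text); it evaluates the closed form of the integral from 1 to A for growing A.

```python
import math

def tail_integral(alpha: float, A: float) -> float:
    """Integral of x**(-alpha) from 1 to A, evaluated in closed form."""
    if alpha == 1.0:
        return math.log(A)
    return (A ** (1.0 - alpha) - 1.0) / (1.0 - alpha)

# alpha > 1 approaches 1/(alpha - 1); alpha <= 1 grows without bound.
for alpha in (0.5, 1.0, 2.0):
    print(alpha, [tail_integral(alpha, 10.0 ** k) for k in (2, 4, 8)])
```

Here alpha = 2 gives values approaching 1, whereas alpha = 0.5, which was harmless near the origin, now produces unbounded values.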

The following criterion for the existence of an integral with an infinite limit 
is often useful. (We again assume that for sufficiently large values of x, say 
for x > a, the integrand is continuous.) 


Criterion of Convergence


The integral $\int_a^\infty f(x)\,dx$ converges if the function f(x) vanishes at infinity to
a higher order than the first, that is, if there is a number $\nu > 1$ such that for all
values of x that are sufficiently large, the relation $|f(x)| \le M/x^\nu$ is true,
where M is a fixed number independent of x. In symbols: $f(x) = O(1/x^\nu)$.
Again, the integral diverges if the function remains positive and vanishes at
infinity to an order not higher than the first, that is, if there is a fixed number
N > 0 such that $x f(x) \ge N$.


The proof of these criteria is exactly parallel to the previous argument and
can be left to the reader.

A very simple example is the integral $\int_a^\infty \frac{1}{x^2}\,dx$ (a > 0). The integrand
vanishes at infinity to the second order. We see at once that the integral
converges, for $\int_a^A \frac{1}{x^2}\,dx = \frac{1}{a} - \frac{1}{A}$, and therefore

$$\int_a^\infty \frac{1}{x^2}\,dx = \frac{1}{a}.$$

Another equally simple example is

$$\int_0^\infty \frac{1}{1 + x^2}\,dx = \lim_{A \to \infty} (\arctan A - \arctan 0) = \frac{\pi}{2}.$$

Then obviously also

$$\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx = \pi,$$

since the integrand is an even function. It is curious that the area between the
curve $y = 1/(1 + x^2)$ and the x-axis (see Fig. 3.8, p. 216) that extends to
infinity turns out to be the same as that of a circle of radius one.


f. The Gamma Function


A further example of particular importance in analysis is that of 
the so-called gamma function 


$$\Gamma(n) = \int_0^\infty e^{-x} x^{n-1}\,dx \qquad (n > 0).$$

Splitting up the interval of integration into one part from x = 0 to
x = 1 and another one from x = 1 to $x = \infty$, we see that the integral
over the first part clearly converges, since $0 < e^{-x} x^{n-1} < 1/x^\mu$ with
$\mu = 1 - n < 1$. For the integral over the second, infinite part, the
criterion of convergence is also satisfied; for example, for $\nu = 2$, we
have $\lim_{x \to \infty} x^2 e^{-x} x^{n-1} = 0$, since the exponential function $e^{-x}$ tends to zero
to a higher order than any power $1/x^m$ (m > 0) (see p. 253). This
gamma function which we consider as a function of the number n
(not necessarily an integer) satisfies a remarkable relation obtained by
integration by parts as follows. First, we have (with $f(x) = x^{n-1}$,
$g'(x) = e^{-x}$)

$$\int e^{-x} x^{n-1}\,dx = -e^{-x} x^{n-1} + (n - 1) \int e^{-x} x^{n-2}\,dx.$$

If we take this integral relation between 0 and A and then let A increase
beyond all bounds, we immediately obtain

$$\Gamma(n) = (n - 1) \int_0^\infty e^{-x} x^{n-2}\,dx = (n - 1)\,\Gamma(n - 1) \quad \text{for } n > 1,$$

and by this recurrence formula, provided $\mu$ is an integer and $0 < \mu < n$,
it follows that

$$\Gamma(n) = (n - 1)(n - 2) \cdots (n - \mu) \int_0^\infty e^{-x} x^{n-\mu-1}\,dx.$$

In particular, if n is a positive integer, we have for $\mu = n - 1$

$$\Gamma(n) = (n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 \int_0^\infty e^{-x}\,dx,$$

and since

$$\int_0^\infty e^{-x}\,dx = 1,$$

we have

$$\Gamma(n) = (n - 1)(n - 2) \cdots 2 \cdot 1 = (n - 1)!,$$

a most useful expression of a factorial by an integral.
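
The factorial relation and the recurrence can be checked numerically. The sketch below is an illustration added here, not the book's method: `gamma_numeric` is a hypothetical helper that applies the midpoint rule on a truncated interval (the cutoff `upper` and the step count are arbitrary choices), and `math.gamma` is the Python standard-library gamma function used for comparison.

```python
import math

def gamma_numeric(n: float, upper: float = 60.0, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of exp(-x) * x**(n-1) over (0, upper).

    Intended for n >= 1, where the integrand is bounded near x = 0.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(-x) * x ** (n - 1)
    return total * h

# Compare the truncated integral with (n - 1)! for small integers n.
for n in (1, 2, 3, 4, 5):
    print(n, gamma_numeric(n), math.factorial(n - 1))

# The recurrence Gamma(n) = (n - 1) * Gamma(n - 1) also holds for non-integer n.
print(math.gamma(2.5), 1.5 * math.gamma(1.5))
```

The truncation at `upper` is harmless here because the integrand decays exponentially; for 0 < n < 1 the endpoint singularity would call for a different quadrature.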
Other Examples. The integrals

$$\int_0^\infty e^{-x^2}\,dx, \qquad \int_0^\infty x^n e^{-x^2}\,dx$$

also converge, as we may easily deduce from our criterion. The first one is
identical with $\tfrac{1}{2}\Gamma(\tfrac{1}{2})$, the second one with $\tfrac{1}{2}\Gamma[(n + 1)/2]$ for n > −1, as is seen
by the substitution $x^2 = u$, $dx = (1/(2\sqrt{u}))\,du$.


g. The Dirichlet Integral 


In many applications we encounter integrals whose convergence does not 
follow directly from our criterion. An important example is furnished by the 


integral

$$I = \int_0^\infty \frac{\sin x}{x}\,dx$$

investigated by Dirichlet. If the upper limit is not infinite but finite, the
integral is convergent since the function (sin x)/x is continuous for all finite x
(for x = 0 it is given by $\lim_{x \to 0} (\sin x)/x = 1$). The convergence of the
integral I is due to the periodic change in sign of the integrand, which causes
contributions to the integral from neighboring intervals of length $\pi$ almost to
cancel one another (Fig. 3.32). Thus the sum of the infinitely many areas
between the x-axis and the curve $y = (\sin x)/x$ converges, if we count areas above
the x-axis as positive and below the x-axis as negative. (On the other hand,
the sum of the numerical values of all areas, that is, the integral

$$\int_0^\infty \frac{|\sin x|}{x}\,dx,$$

can easily be shown to diverge.)

Figure 3.32 Graph of $y = (\sin x)/x$.

The alternating character of the function sin x accounts for the fact that
its indefinite integral

$$\int_0^x \sin \xi\,d\xi = 1 - \cos x$$

is bounded for all x. We make use of this fact in estimating the expression

$$I_{AB} = \int_A^B \frac{\sin x}{x}\,dx = \int_A^B \frac{1}{x}\,\frac{d(1 - \cos x)}{dx}\,dx.$$

Integration by parts shows that

$$I_{AB} = \frac{1 - \cos B}{B} - \frac{1 - \cos A}{A} + \int_A^B \frac{1 - \cos x}{x^2}\,dx.$$

Hence

$$\int_0^\infty \frac{\sin x}{x}\,dx = \lim_{\substack{A \to 0 \\ B \to \infty}} I_{AB} = \int_0^\infty \frac{1 - \cos x}{x^2}\,dx,$$

where the integral on the right-hand side clearly is convergent. In other
words, the integral I exists. In Section 8.4c we shall establish further the
remarkable fact that I has the value $\pi/2$.
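
As a numerical illustration of the transformed integral, the following Python sketch approximates the integral of $(1 - \cos x)/x^2$ from 0 to a finite bound B by Simpson's rule. The helper names `g` and `simpson`, the bound B = 200, and the step count are assumptions made for this example. Since the integrand lies between 0 and $2/x^2$, the neglected tail beyond B is at most 2/B.

```python
import math

def g(x: float) -> float:
    """(1 - cos x) / x**2, written via sin(x/2) to avoid cancellation near 0."""
    if x == 0.0:
        return 0.5  # limiting value as x -> 0
    s = math.sin(0.5 * x)
    return 2.0 * s * s / (x * x)

def simpson(f, a: float, b: float, n: int) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

B = 200.0
approx = simpson(g, 0.0, B, 40_000)
print(approx, math.pi / 2)  # the two values differ by at most about 2/B
```

The integrand here is absolutely integrable, which is why a plain quadrature on a long interval behaves well; applying the same approach directly to (sin x)/x would converge much more slowly in B.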


h. Substitution. Fresnel Integrals 


Obviously, all rules for the substitution of new variables, etc., 
remain valid for convergent improper integrals. Often such trans- 
formations can lead to different, more tractable expressions for the 
integral. 

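As an example, to calculate

$$\int_0^\infty x e^{-x^2}\,dx$$

we introduce the new variable $u = x^2$ and obtain

$$\int_0^\infty x e^{-x^2}\,dx = \frac{1}{2} \int_0^\infty e^{-u}\,du = \lim_{A \to \infty} \frac{1}{2}\left(1 - e^{-A}\right) = \frac{1}{2}.$$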
Another example in the investigation of improper integrals is given
by the Fresnel integrals, which occur in the theory of diffraction of
light:

$$F_1 = \int_0^\infty \sin(x^2)\,dx, \qquad F_2 = \int_0^\infty \cos(x^2)\,dx.$$

The substitution $x^2 = u$ yields

$$F_1 = \frac{1}{2} \int_0^\infty \frac{\sin u}{\sqrt{u}}\,du, \qquad F_2 = \frac{1}{2} \int_0^\infty \frac{\cos u}{\sqrt{u}}\,du.$$

Integrating by parts, we find

$$\int_A^B \frac{\sin u}{\sqrt{u}}\,du = \frac{1 - \cos B}{\sqrt{B}} - \frac{1 - \cos A}{\sqrt{A}} + \frac{1}{2} \int_A^B \frac{1 - \cos u}{u^{3/2}}\,du.$$

As A and B tend to zero and infinity respectively, we see by the same
argument as for the Dirichlet integral that the integral $F_1$ converges.
The convergence of the integral $F_2$ is proved in exactly the same way.

These Fresnel integrals show that an improper integral may exist
even if the integrand does not tend to zero as $x \to \infty$. In fact, an
improper integral can exist even when the integrand is unbounded, as
is shown by the example

$$\int_0^\infty 2u \cos(u^4)\,du.$$

When $u^4 = n\pi$, that is, when $u = \sqrt[4]{n\pi}$, n = 0, 1, 2, ..., the integrand
becomes $2\sqrt[4]{n\pi} \cos n\pi = \pm 2\sqrt[4]{n\pi}$, so that the integrand is unbounded.
By the substitution $u^2 = x$, however, the integral is reduced to

$$\int_0^\infty \cos(x^2)\,dx,$$

which we have just shown to be convergent.
By means of a substitution an improper integral may often be 
transformed into a proper one. For example, the transformation 


x = sin u gives

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \int_0^{\pi/2} du = \frac{\pi}{2}.$$

On the other hand, integrals of continuous functions may be transformed
into improper integrals; this occurs if the transformation
$u = \phi(x)$ is such that at the end of the interval of integration the
derivative $\phi'(x)$ vanishes, so that dx/du is infinite.
3.16 The Differential Equations of the Trigonometric Functions
a. Introductory Remarks on Differential Equations 


Integration is merely the first step into a much more extensive
field: Instead of inverting differentiation by integration, that is of 
solving the equation y’ = f(x) with given f(x) for y = F(x), we might 
aim at finding functions y = F(x) which satisfy more general relation- 
ships between y and derivatives of y. Such “differential equations” 
occur everywhere in applications as well as in strictly theoretical 
contexts. Penetrating studies far beyond the framework of this book 
are made of these equations: we shall return to some elementary 
aspects of the theory of differential equations later in this and the 
following volume. At this stage we confine ourselves to a quite simple, 
yet significant, example. We shall discuss the differential equations of 
the functions sin x and cos x, which we have already mentioned on 
p. 171. 

Although in elementary trigonometry these functions and their 
properties were taken from a geometric standpoint, we now discard the 
reliance on geometric intuition and put the trigonometric functions in a 
simple way on a precise, analytical basis, in accordance with the general 
trend of development mentioned before. 


b. Sin x and cos x Defined by a Differential Equation 
and Initial Conditions 


We consider the differential equation

$$u'' + u = 0$$

with the aim of characterizing solutions u(x) which we shall identify
with the sine and cosine functions. Any function u = F(x) satisfying
the equation, that is for which F''(x) + F(x) = 0, is called a solution.¹

At once we realize that together with a solution u = F(x) the
function u = F(x + h) for arbitrary constant h is also a solution, as
immediately verified by differentiating F(x + h) twice with respect to x.
Similarly, it is immediately seen that with F(x) the derivative F'(x) = u'
is also a solution, as is of course, cF(x) with a constant factor c. In
addition, together with $F_1(x)$ and $F_2(x)$ any linear combination
$c_1 F_1(x) + c_2 F_2(x) = F(x)$ with constants $c_1$ and $c_2$ is a solution.


1 Of course, it is always understood that the functions under consideration are 
sufficiently differentiable. 
To single out from the multitude of solutions of the differential
equation a specific one, we impose “initial conditions” stipulating that
for x = 0 the values of u = F(0) and u' = F'(0) be prescribed as a and b
respectively. We state first:

The solution is uniquely determined by these initial values.

For the proof we start with a general remark valid for any solution u.
By multiplying the differential equation with 2u' we find because of
$2u''u' = (u'^2)'$ and $2u'u = (u^2)'$ the equation

$$0 = 2u'u'' + 2u'u = [(u')^2 + u^2]',$$

which can be integrated at once and implies

$$u'^2 + u^2 = c,$$

where c is a constant, that is, does not depend on x; therefore c must
have the same value as the left-hand side for x = 0. Thus we have for
any solution u

$$u'^2(0) + u^2(0) = c.$$

Now, suppose we have two solutions $u_1$ and $u_2$ with the same initial
conditions: Then the difference $z = u_1 - u_2$ is a solution with z'(0) =
z(0) = 0. Hence we have c = 0 and for all x, $z'^2 + z^2 = 0$; this means
that z = 0 and z' = 0 which obviously proves our statement.
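
The conserved quantity used in this proof can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the text: it integrates the equation as a first-order system with the classical fourth-order Runge-Kutta scheme (the function name `rk4_oscillator` and the step count are choices made here) and checks that $u'^2 + u^2$ keeps its initial value.

```python
import math

def rk4_oscillator(a: float, b: float, x_end: float, steps: int = 10_000):
    """Integrate u'' + u = 0 with u(0) = a, u'(0) = b by classical RK4.

    Returns the pair (u, u') at x = x_end. The equation is treated as the
    first-order system u' = v, v' = -u.
    """
    h = x_end / steps
    u, v = a, b
    for _ in range(steps):
        k1u, k1v = v, -u
        k2u, k2v = v + 0.5 * h * k1v, -(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -(u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u, v

a, b = 2.0, -1.0
u, v = rk4_oscillator(a, b, 5.0)
print(u * u + v * v, a * a + b * b)  # the invariant stays at its initial value
```

Starting from a = b = 0 the same routine returns (0, 0) for every endpoint, which mirrors the uniqueness argument: a solution with vanishing initial data vanishes identically.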

We now define the functions sin x and cos x as those solutions of the 
differential equation u"(x) + u(x) = 0 for which the initial conditions 
are, respectively, for u = sin x,

$$u(0) = a = 0, \qquad u'(0) = b = 1,$$

and for u = cos x,

$$u(0) = a = 1, \qquad u'(0) = b = 0.$$

We take for granted here the fact that such solutions exist and are
arbitrarily often differentiable, since its proof will be given later anyway
in a more general context (see Section 9.2).¹

The only solution u of u'' + u = 0 for which u = a, u' = b for x = 0
is then the function u = a cos x + b sin x. This proves that every
solution of the differential equation is a linear combination of cos x
and sin x.

1 Incidentally, we can infer these facts immediately from the equation $u'^2 + u^2 = 1$,
which is valid for sin x as well as for cos x and from whose equivalent form
$dx/du = 1/\sqrt{1 - u^2}$ the inverse functions of sin x and cos x are immediately
obtained by integrations.

Now we obtain the basic properties of the trigonometric functions
from our differential equation u'' + u = 0 applied, for example, to the
function u = sin x. Obviously, with u also v = u' is a solution:
v'' + v = 0. Because of u'' + u = v' + u = 0 we have v'(0) =
−u(0) = 0 whereas v(0) = u'(0) = 1. Hence

$$v(x) = \cos x = \frac{d}{dx} \sin x.$$

Similarly, we derive (d/dx) cos x = −sin x.
The central theorem of trigonometry is the addition theorem

$$\cos(x + y) = \cos x \cos y - \sin x \sin y.$$

It now follows immediately from our approach: First, the function
cos (x + y) as a function of x, with y remaining constant for the
moment, is a solution u(x) of the differential equation u'' + u = 0
satisfying for x = 0 the initial conditions u(0) = cos y = a and u'(0) =
−sin y = b. Now, as verified immediately, the solution (according
to the preceding statement, the only one) for which u(0) = a and
u'(0) = b is a cos x + b sin x. Hence we have at once for our solution
cos (x + y) the expression

$$\cos(x + y) = \cos x \cos y - \sin x \sin y,$$

as we wanted to prove.
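
The structure of this argument can be mirrored in a small numerical sanity check. The Python sketch below is only an illustration: it assumes that the library functions `math.cos` and `math.sin` coincide with the solutions defined above, and the helper name `addition_theorem_gap` is made up for this example. It reads off the initial values a = cos y and b = −sin y and compares cos (x + y) with a cos x + b sin x.

```python
import math

def addition_theorem_gap(x: float, y: float) -> float:
    """Difference between cos(x + y) and the solution a*cos(x) + b*sin(x)
    determined by the initial values a = u(0) = cos(y), b = u'(0) = -sin(y)."""
    a = math.cos(y)
    b = -math.sin(y)
    return math.cos(x + y) - (a * math.cos(x) + b * math.sin(x))

for x, y in [(0.0, 1.0), (0.7, 1.9), (-2.5, 4.0)]:
    print(x, y, addition_theorem_gap(x, y))  # differences at rounding level
```

A check of this kind does not replace the uniqueness proof; it only confirms, at a few sample points, that both expressions agree to floating-point accuracy.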

The remarks in this section should suffice to indicate how trigono- 
metric functions can be introduced in an entirely analytical manner 
without any reference to geometry. 

Without going into further details we mention the following. 


The number $\tfrac{1}{2}\pi$ could now be defined as the smallest positive value
of x for which cos x = 0.

The periodicity of the trigonometric functions likewise follows 
easily from the analytic approach. 


We shall return to the analytical construction of the trigonometric
functions by infinite power series (see Section 5.5b).


PROBLEMS 


SECTION 3.1, page 201 


1. Let $P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$.
(a) Calculate the polynomial F(x) from the equation

$$F(x) - F'(x) = P(x).$$

*(b) Calculate F(x) from the equation

$$c_0 F(x) + c_1 F'(x) + c_2 F''(x) = P(x).$$

2. Find the limit as $n \to \infty$ of the absolute value of the nth derivative of
1/x at the point x = 2.


3. Prove if $f^{(n)}(x) = 0$ for all x, then f is a polynomial of degree at most
n − 1, and conversely.


4. Determine the form of a rational function r for which 


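5. Prove by induction that the nth derivative of a product may be found
according to the following rule (Leibnitz's rule):

$$\frac{d^n}{dx^n}(fg) = f\,\frac{d^n g}{dx^n} + \binom{n}{1} \frac{df}{dx}\,\frac{d^{n-1} g}{dx^{n-1}} + \binom{n}{2} \frac{d^2 f}{dx^2}\,\frac{d^{n-2} g}{dx^{n-2}} + \cdots + \binom{n}{n-1} \frac{d^{n-1} f}{dx^{n-1}}\,\frac{dg}{dx} + \frac{d^n f}{dx^n}\,g.$$

Here $\binom{n}{1} = n$, $\binom{n}{2} = \frac{n(n-1)}{2!}$, etc., denote binomial coefficients.

6. Prove that $\sum_{i=1}^{n} i x^{i-1} = \dfrac{n x^{n+1} - (n + 1) x^n + 1}{(x - 1)^2}$.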


SECTION 3.2, page 206 

1. Let $y = e^x(a \sin x + b \cos x)$. Show that y'' can be expressed as a
linear combination of y and y', that is,

y'' = py' + qy,

where p and q are constants. Express all higher derivatives as linear
combinations of y' and y.

*2. Find the nth derivative of arc sin x at x = 0, and then of $(\arcsin x)^2$
at x = 0.
SECTION 3.3, page 217 

1. Find the second derivative of f[g{h(x)}].

2. Differentiate the following function: $\log_{v(x)} u(x)$ [that is, the logarithm
of u(x) to the base v(x); v(x) > 0].

3. What conditions must the coefficients $\alpha, \beta, a, b, c$ satisfy in order that

$$\frac{\alpha x + \beta}{\sqrt{a x^2 + 2bx + c}}$$

shall everywhere have a finite derivative that is never zero?

4. Show that $d^n(e^{x^2/2})/dx^n = u_n(x)\,e^{x^2/2}$, where $u_n(x)$ is a polynomial of
degree n. Establish the recurrence relation

$$u_{n+1} = x u_n + u_n'.$$
*5. By applying Leibnitz's rule to

$$\frac{d^{n+1}}{dx^{n+1}}\left(e^{x^2/2}\right) = \frac{d^n}{dx^n}\left(x e^{x^2/2}\right),$$

obtain the recurrence relation

$$u_{n+1} = x u_n + n u_{n-1}.$$

*6. By combining the recurrence relations of Problems 4 and 5, obtain
the differential equation

$$u_n'' + x u_n' - n u_n = 0$$

satisfied by $u_n(x)$.

7. Find the polynomial solution

$$u_n(x) = x^n + a_1 x^{n-1} + \cdots + a_n$$

of the differential equation $u_n'' + x u_n' - n u_n = 0$.


*8. If $P_n(x) = \dfrac{1}{2^n n!} \dfrac{d^n}{dx^n} (x^2 - 1)^n$, prove the relations

(a) $P_{n+1}' = \dfrac{x^2 - 1}{2(n + 1)} P_n'' + \dfrac{(n + 2)x}{n + 1} P_n' + \dfrac{n + 2}{2} P_n$.

(b) $P_{n+1}' = x P_n' + (n + 1) P_n$.

(c) $\dfrac{d}{dx} [(x^2 - 1) P_n'] - n(n + 1) P_n = 0$.


9. Find the polynomial solution

$$P_n = \frac{(2n)!}{2^n (n!)^2} x^n + a_1 x^{n-1} + \cdots + a_n$$

of the differential equation

$$\frac{d}{dx} [(x^2 - 1) P_n'] - n(n + 1) P_n = 0.$$

10. Determine the polynomial $P_n(x) = \dfrac{1}{2^n n!} \dfrac{d^n}{dx^n} (x^2 - 1)^n$ by using the
binomial theorem.


*11. Let $A_{n,p}(x) = \binom{p}{n} x^n (1 - x)^{p-n}$, n = 0, 1, 2, ..., p. Show that

$$1 = \sum_{n=0}^{p} A_{n,p}(x).$$


SECTION 3.4, page 223 
1. The function f(x) satisfies the equation

$$f(x + y) = f(x) f(y).$$

(a) If f(x) is differentiable, either $f(x) \equiv 0$ or $f(x) = e^{ax}$.
*(b) If f(x) is continuous, either $f(x) \equiv 0$ or $f(x) = e^{ax}$.

2. If a differentiable function f(x) satisfies the equation

$$f(xy) = f(x) + f(y),$$

then f(x) = a log x.


3. Prove that if f(x) is continuous and

$$f(x) = \int_0^x f(t)\,dt,$$

then f(x) is identically zero.
SECTION 3.5, page 228 


1. Prove the formula

$$\sinh a + \sinh b = 2 \sinh\left(\frac{a + b}{2}\right) \cosh\left(\frac{a - b}{2}\right).$$

Obtain similar formulas for sinh a − sinh b, cosh a + cosh b, cosh a −
cosh b.

2. Express tanh (a + b) in terms of tanh a and tanh b.

Express coth (a + b) in terms of coth a and coth b.

Express $\sinh \tfrac{1}{2}a$ and $\cosh \tfrac{1}{2}a$ in terms of cosh a.

3. Differentiate

(a) cosh x + sinh x; (b) $e^{\tanh x + \coth x}$;

(c) log sinh (x + cosh² x); (d) ar cosh x + ar sinh x; (e) ar sinh (x cosh x);

(f) ar tanh (2x/(1 + x²)).

4. Calculate the area bounded by the catenary y = cosh x, the ordinates
x = a and x = b, and the x-axis.


SECTION 3.6, page 236 


1. Determine the maxima, minima, and points of inflection of $x^3 + 3px + q$.
Discuss the nature of the roots of $x^3 + 3px + q = 0$.

2. Given the parabola $y^2 = 2px$, p > 0, and a point P($x = \xi$, $y = \eta$)
within it ($\eta^2 < 2p\xi$), find the shortest path (consisting of two line segments)
leading from P to a point Q on the parabola and then to the focus
F($x = \tfrac{1}{2}p$, y = 0) of the parabola. Show that the angle FQP is bisected by the
normal to the parabola, and that QP is parallel to the axis of the parabola
(principle of the parabolic mirror).

3. Among all triangles with given base and given vertical angle, the isosceles 
triangle has the maximum area. 

4. Among all triangles with given base and given area, the isosceles tri- 
angle has the maximum vertical angle. 

*5. Among all triangles with given area, the equilateral triangle has the 
least perimeter. 

*6. Among all triangles with given perimeter the equilateral triangle has 
the maximum area. 

*7. Among all triangles inscribed in a circle the equilateral triangle has 
the maximum area. 


8. Prove that if p > 1 and x > 0, $x^p - 1 \ge p(x - 1)$.

9. Prove the inequality $1 > (\sin x)/x > 2/\pi$, $0 < x < \pi/2$.

10. Prove that (a) $\tan x \ge x$, $0 \le x < \pi/2$.

(b) $\cos x \ge 1 - x^2/2$.

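*11. Given $a_1 > 0, a_2 > 0, \ldots, a_n > 0$, determine the minimum of

$$\frac{\dfrac{a_1 + \cdots + a_{n-1} + x}{n}}{\sqrt[n]{a_1 a_2 \cdots a_{n-1} x}}$$

for x > 0. Use the result to prove by mathematical induction that (cf.
Problem 13, p. 109)

$$\sqrt[n]{a_1 a_2 \cdots a_n} \le \frac{a_1 + \cdots + a_n}{n}.$$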


12. (a) Given n fixed numbers $a_1, \ldots, a_n$, determine x so that $\sum_{i=1}^{n} (a_i - x)^2$
is a minimum.

*(b) Minimize $\sum_{i=1}^{n} |a_i - x|$.

*(c) Minimize $\sum_{i=1}^{n} \lambda_i |a_i - x|$, where $\lambda_i > 0$.


13. Sketch the graph of the function
$y = (x^2)^x$, y(0) = 1.


Show that the function is continuous at x = 0. Has the function maxima, 
minima, or points of inflection ? 


*14. Find the least value $\alpha$ such that

$$\left(1 + \frac{1}{x}\right)^{x + \alpha} > e$$

for all positive x. (Hint: It is known that $[1 + (1/x)]^{x+1}$ decreases
monotonically and $[1 + (1/x)]^x$ increases monotonically to the limit e at infinity.)


*15. (a) Find the point such that the sum of the distances to the three 
sides of a triangle is a minimum. 

(b) Find the point for which the sum of the distances to the vertices is a
minimum.


16. Prove the following inequalities:

(a) $e^x > 1/(1 + x)$, x > 0.

(b) $e^x > 1 + \log(1 + x)$, x > 0.

(c) $e^x > 1 + (1 + x) \log(1 + x)$, x > 0.

17. Suppose f''(x) < 0 on (a, b). Prove:

(a) Every arc of the graph within the interval lies above the chord joining
its endpoints.

(b) The graph lies below the tangent at any point within (a, b).


*18. Let f be a function possessing a second derivative on (a, b).
(a) Show that either condition a or b of Problem 17 is sufficient for
$f''(x) \le 0$.
(b) Show that the condition

$$f\left(\frac{x + y}{2}\right) \ge \frac{f(x) + f(y)}{2}$$

for all x and y in (a, b) is sufficient for $f''(x) \le 0$.

*19. Let a, b be two positive numbers, p and q any nonzero numbers,
p < q. Prove that

$$[\theta a^p + (1 - \theta) b^p]^{1/p} \le [\theta a^q + (1 - \theta) b^q]^{1/q}$$

for all values of $\theta$ in the interval $0 < \theta < 1$.

(This is Jensen's inequality, which states that the pth power mean
$[\theta a^p + (1 - \theta) b^p]^{1/p}$ of two positive quantities a, b is an increasing function of p.)


20. Show that the equality sign in the above inequality holds if, and only
if, a = b.


21. Prove that $\lim_{p \to 0} [\theta a^p + (1 - \theta) b^p]^{1/p} = a^\theta b^{1-\theta}$.

22. Defining the zeroth power mean of a, b as $a^\theta b^{1-\theta}$, show that Jensen's
inequality applies to this case, and becomes ($a \ne b$),

$$a^\theta b^{1-\theta} \lessgtr [\theta a^q + (1 - \theta) b^q]^{1/q} \quad \text{according to whether } q \gtrless 0.$$

For q = 1, $a^\theta b^{1-\theta} < \theta a + (1 - \theta) b$.

23. Prove the inequality

$$a^\theta b^{1-\theta} \le \theta a + (1 - \theta) b,$$

a, b > 0, $0 < \theta < 1$, without reference to Jensen's inequality, and show that
equality holds only if a = b. (This inequality states that the $\theta$, $1 - \theta$
geometric mean is less than the corresponding arithmetic mean.)


*24. Let f be continuous and positive on [a, b] and let M denote its
maximum value. Prove

$$M = \lim_{n \to \infty} \sqrt[n]{\int_a^b [f(x)]^n\,dx}.$$


SECTION 3.7, page 248 
1. Let f(x) be a continuous function vanishing, together with its first
derivative, for x = 0. Show that f(x) vanishes to a higher order than x as
$x \to 0$.

2. Show that

$$f(x) = \frac{a_0 x^n + a_1 x^{n-1} + \cdots + a_n}{b_0 x^m + b_1 x^{m-1} + \cdots + b_m},$$

when $a_0, b_0 \ne 0$, is of the same order of magnitude as $x^{n-m}$, when $x \to \infty$.

*3. Prove that $e^x$ is not a rational function.

*4. Prove that $e^x$ cannot satisfy an algebraic equation with polynomials
in x as coefficients.

5. If the order of magnitude of the positive function f(x) as $x \to \infty$ is
higher, the same, or lower than that of $x^m$, prove that $\int_a^x f(\xi)\,d\xi$ has the
corresponding order of magnitude relative to $x^{m+1}$.

6. Compare the order of magnitude as $x \to \infty$ of $\int_a^x f(\xi)\,d\xi$ relative to
f(x) for the following functions f(x):

(a) $e^{\sqrt{x}}/\sqrt{x}$. (b) $e^x$. (c) $x e^{x^2}$. (d) log x.

SECTION 3.8, page 263

1. Find the limit as $n \to \infty$ of $a_n$ =

arl ee oe

*2. Find the limit of

$$b_n = \frac{1}{\sqrt{n^2 - 0}} + \frac{1}{\sqrt{n^2 - 1}} + \frac{1}{\sqrt{n^2 - 4}} + \cdots + \frac{1}{\sqrt{n^2 - (n - 1)^2}}.$$

*3. If $\alpha$ is any real number greater than −1, evaluate

$$\lim_{n \to \infty} \frac{1^\alpha + 2^\alpha + 3^\alpha + \cdots + n^\alpha}{n^{\alpha + 1}}.$$

SECTION 3.11, page 274 


1. Show that for all odd positive values of n the integral $\int e^{-x^2} x^n\,dx$ can be
evaluated in terms of elementary functions.


2. Show that if n is even, the integral $\int e^{-x^2} x^n\,dx$ can be evaluated in terms
of elementary functions and the integral $\int e^{-x^2}\,dx$ (for which tables have been
constructed).

3. Prove that

$$\int_0^x \left[\int_0^u f(t)\,dt\right] du = \int_0^x f(u)(x - u)\,du.$$

*4. Problem 3 gives a formula for the second iterated integral. Prove that
the nth iterated integral of f(x) is given by

$$\frac{1}{(n - 1)!} \int_0^x f(u)(x - u)^{n-1}\,du.$$

5. Prove for the binomial coefficient $\binom{n}{k}$ that

$$\binom{n}{k} = \left[(n + 1) \int_0^1 x^k (1 - x)^{n-k}\,dx\right]^{-1}.$$

6. Obtain a recursive formula for 
ferae + b)? dx 
and use this relation to integrate 


fae + 1} dz. 
7. (a) Let $P_n(x) = \dfrac{1}{2^n n!} \dfrac{d^n}{dx^n} (x^2 - 1)^n$. Show that

$$\int_{-1}^{1} P_n(x) P_m(x)\,dx = 0, \quad \text{if } m \ne n.$$

(b) Prove that $\int_{-1}^{1} P_n^2(x)\,dx = \dfrac{2}{2n + 1}$.

(c) Prove that $\int_{-1}^{1} x^m P_n(x)\,dx = 0$, if m < n.

(d) Evaluate $\int_{-1}^{1} x^n P_n(x)\,dx$.

SECTION 3.12, page 282 
*1. Integrate 


dr 
fa +10 


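2. Use the partial fraction expansion to prove Newton's formulas

$$\frac{\alpha_1^k}{g'(\alpha_1)} + \frac{\alpha_2^k}{g'(\alpha_2)} + \cdots + \frac{\alpha_n^k}{g'(\alpha_n)} =
\begin{cases} 0 & \text{for } k = 0, 1, 2, \ldots, n - 2 \\ 1 & \text{for } k = n - 1, \end{cases}$$

where g(x) is a polynomial of the form $x^n + a_1 x^{n-1} + \cdots$ with distinct
roots $\alpha_1, \ldots, \alpha_n$.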


SECTION 3.14, page 298 


*1. Prove that the substitution $x = (\alpha t + \beta)/(\gamma t + \delta)$ with $\alpha\delta - \gamma\beta \ne 0$,
transforms the integral

$$\int \frac{dx}{\sqrt{a x^4 + b x^3 + c x^2 + dx + e}}$$

into an integral of similar type, and that if the biquadratic

$$a x^4 + b x^3 + c x^2 + dx + e$$

has no repeated factors, neither has the new biquadratic in t which takes its
place. Prove that the same is true for

$$\int R\left(x, \sqrt{a x^4 + b x^3 + c x^2 + dx + e}\right) dx,$$

where R is a rational function.

2. The function

$$\phi(x) = \int_0^x \frac{du}{\sqrt{1 - k^2 \sin^2 u}}$$

is known as the elliptic integral of the first kind.

(a) Show that $\phi$ is continuous and increasing and hence has a continuous
inverse.

(b) Let am(x) denote the inverse of $\phi(x)$. Prove sn(x) = sin [am(x)], where
sn(x) is defined on p. 299, footnote 3.
SECTION 3.15, page 301


o0 , 1 
*1. Prove that | sin? [> (= + ‘) dx does not exist. 
0 


dx 


[e 0) 
*2, i —— = 
Prove that lim f iF kr 


k-—+® 


3. For what values of s is ol = ~ dr, of 2 —— ~ dx convergent ? 


sin ¢ 
*4. Does —— dt converge? 
f T+? 8 


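*5. (a) If a is a fixed positive number, prove that

$$\lim_{h \to 0+} \int_{-a}^{a} \frac{h}{h^2 + x^2}\,dx = \pi.$$

(b) If f(x) is continuous in the interval $-1 \le x \le 1$, prove that

$$\lim_{h \to 0+} \int_{-1}^{1} \frac{h}{h^2 + x^2}\,f(x)\,dx = \pi f(0).$$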
h—0 


z 
*6. Prove that lim | e? dt =0. 
0 


gz— 0O 


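7. Assuming that $|\alpha| \ne |\beta|$, prove that

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \sin \alpha x \sin \beta x\,dx = 0.$$

*8. If $\int_a^\infty \frac{f(x)}{x}\,dx$ converges for any positive value of a, and if f(x) tends to a
limit L as $x \to 0$, show that

$$\int_0^\infty \frac{f(\alpha x) - f(\beta x)}{x}\,dx$$

converges for $\alpha$ and $\beta$ positive and has the value $L \log(\beta/\alpha)$.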


9. By reference to the Problem 8, show that 
I= oe = log Ê. 


x 


*10. If $\int_a^b \frac{f(x)}{x}\,dx$ converges for any positive values of a and b, and if f(x)
tends to a limit M as $x \to \infty$ and a limit L as $x \to 0$, show that

$$\int_0^\infty \frac{f(\alpha x) - f(\beta x)}{x}\,dx = (L - M) \log\frac{\beta}{\alpha}.$$

11. Obtain the following expressions for the gamma function:

$$\Gamma(n) = 2 \int_0^\infty x^{2n-1} e^{-x^2}\,dx,$$

$$\Gamma(n) = \int_0^1 \left(\log \frac{1}{x}\right)^{n-1} dx.$$

SECTION 3.16, page 312 
1. Obtain the addition formula for sin (x + y).


2. Without using the addition formulas prove that cos x is an even function 
and sin x, odd. 


3. (a)* Prove for some positive h that cos x < 1 for 0 < x < h.
(b) Prove if cos z > 0 for $0 < z < 2^n x$ that
$\cos (2^{n+1} x) < 2^n (\cos x - 1) + 1$.


(c) Combining the results (a) and (b) prove that cos x has a zero. 
4. Let a be the smallest positive zero of cos x. Prove that


sin (x + 4a) = sin x,
cos (x + 4a) = cos x.


5. Fill in the steps of the following indirect proof that cos x has a zero: 
(a) If cos x has no zeros, then sin x is monotonically increasing for x > 0. 
(b) The functions sin x and cos x are bounded from above and below. 
(c) The limit of sin x as x tends to infinity exists and is positive. 
(d) The equation

$$\cos x = 1 - \int_0^x \sin t \, dt$$

stands in contradiction to (b).


MISCELLANEOUS PROBLEMS 


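1. Prove

$$\frac{d^n}{dx^n} f(\log x) = x^{-n}\, \frac{d}{dt}\left(\frac{d}{dt} - 1\right)\left(\frac{d}{dt} - 2\right) \cdots \left(\frac{d}{dt} - n + 1\right) f(t)$$

when t = log x. Here, we employ

$$\left(\frac{d}{dt} - k\right)\phi = \frac{d\phi}{dt} - k\phi,$$

where $\phi$ is any function of t and k is a constant.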

2. A smooth closed curve C is said to be convex if it lies wholly to one side
of each tangent. Show that for the triangle of minimum area circumscribed
about C each side is tangent to C at its midpoint.


4 


Applications in 
Physics and Geometry 


4.1 Theory of Plane Curves 


a. Parametric Representation 
Definition 


The representation of a curve by an equation y = f(x) imposes a 
serious geometrical restriction: A curve so represented must not be 
intersected at more than one point by any parallel to the y-axis. 
Usually, this restriction can be overcome by decomposing the curve 
into portions each representable in the form y = f(x). Thus a 
circle of radius a about the origin is given by the two functions 
$y = \sqrt{a^2 - x^2}$ and $y = -\sqrt{a^2 - x^2}$ defined for $-a \le x \le a$.
However, for as simple a curve as a parallel to the y-axis this device does
not work. 

More flexibility is obtained by an implicit representation through an
equation $\phi(x, y) = 0$ which involves a function $\phi$ of two independent
variables. For example, the circle of radius a about the origin is
completely described by $\phi(x, y) = x^2 + y^2 - a^2 = 0$. Any straight line
in the plane has an implicit equation of the form ax + by +c = 0, 
where a, b, c are constants and a and b do not both vanish; for b = 0 
we obtain a parallel to the y-axis. 

The implicit description of a curve has the disadvantage that to find
points (x, y) of the curve at all, say for a given x, we must solve the
equation $\phi(x, y) = 0$. This problem we shall discuss in detail in
Volume II.
The most direct and most flexible description of a curve is a parametric
representation. Instead of considering one of the rectangular
coordinates y or x as a function of the other we think of both coordinates
x and y as functions of a third independent variable t, a so-called
parameter; the point with coordinates x and y then describes the
curve as t traverses a corresponding interval. Such parametric
representations have already been encountered; for example, the circle
$x^2 + y^2 = a^2$ has the parametric representation x = a cos t,
y = a sin t. Here t denotes the angle at the center of the circle.


Figure 4.1 


For the ellipse $x^2/a^2 + y^2/b^2 = 1$ we have the similar parametric
representation x = a cos t, y = b sin t, where t is the so-called eccentric
angle, that is, the angle at the center corresponding to the point of the
circumscribed circle lying vertically above or below the point P =
(a cos t, b sin t) of the ellipse. We assume here that b < a (see Fig.
4.1). In both cases the point with the coordinates x, y describes the
complete circle or ellipse as the parameter t traverses the interval
$0 \le t < 2\pi$.

In general, curves C are parametrically represented by two functions 
of a parameter t, 


$$x = \phi(t) = x(t), \qquad y = \psi(t) = y(t);$$


1 This word denotes an auxiliary variable which we do not want to emphasize 
primarily. 
the shorter notation x(t) and y(t) will be used when there is no danger
of confusion.¹

We assume throughout that $\phi$ and $\psi$ possess continuous derivatives
unless the contrary is said.


Mapping of Parameter Interval on Curve—Sense of Direction 


For a given curve these two functions $\phi(t)$ and $\psi(t)$ must be
determined in such a way that the set of pairs of functional values x(t) and
y(t) corresponding to a certain interval of values t defines all the points
on the curve and no other points. We have then a correspondence
between the points of the curve and the values of t in an interval of the
t-axis. The parameter representation defines a mapping of the t-axis
onto the curve, the original point t on the t-axis being mapped onto the
point $x = \phi(t)$, $y = \psi(t)$ of C.

Since x(t) and y(t) are assumed continuous, neighboring points on the
t-axis correspond to neighboring points on the curve. Since the points
of the t-axis are ordered, we may in an obvious manner assign an
order or “sense” to the points of C by saying that the point onto which
the number $t_0$ is mapped precedes the point onto which $t_1$ is mapped if
$t_0 < t_1$ (see p. 334). The parametric representation thus gives precise
meaning to the vague intuitive notion of a curve as a set of points in
which the points are arranged in the same order as on a straight
line.


b. Change of Parameters 


The values of the parameter t serve to distinguish the different
points on the curve C; they play the role of “names” for the individual
points of the curve.

The same curve C admits of many different parameter representations.
Any quantity that varies continuously along the curve and has different
values in different points of the curve can serve as parameter.

If, say, the curve originally is given by an equation y = f(x), we
can choose for the parameter t the variable x and describe the curve by
the functions x = t, y = f(t). Similarly, for a curve described by giving
x as a function of y, say x = g(y), we can use y as parameter t and write
x = g(t), y = t.


1 The notation $x = \phi(t)$, etc., puts emphasis on the specific functional connection
between the dependent and independent variable; the notation x(t), etc., just
means that t is to be considered as the independent variable which determines the
function value of x in some prescribed way.
For a curve given by an equation $r = h(\theta)$ in polar coordinates r, $\theta$
(see Chapter 1, p. 101) we can choose $\theta$ as parameter t and obtain
the parametric representation


$$x = r \cos \theta = h(t) \cos t = \phi(t),$$
$$y = r \sin \theta = h(t) \sin t = \psi(t).$$
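
As an illustration of this construction, the following minimal sketch builds the pair $\phi$, $\psi$ from a given polar function h. The helper name `polar_to_parametric` and the sample curve $r = 2a \cos \theta$ (a circle of radius a about the point (a, 0)) are assumptions chosen for the example and do not come from the text.

```python
import math

def polar_to_parametric(h):
    """Return (phi, psi) with x = phi(t) = h(t) cos t, y = psi(t) = h(t) sin t."""
    def phi(t):
        return h(t) * math.cos(t)

    def psi(t):
        return h(t) * math.sin(t)

    return phi, psi

# Hypothetical example: r = 2a cos(theta) is the circle (x - a)^2 + y^2 = a^2.
A = 1.5
phi, psi = polar_to_parametric(lambda theta: 2.0 * A * math.cos(theta))

if __name__ == "__main__":
    for t in (0.0, 0.4, 1.0, 2.0):
        x, y = phi(t), psi(t)
        # The residual of the implicit circle equation should be close to 0.
        print(t, (x - A) ** 2 + y ** 2 - A ** 2)
```

The polar angle itself is used as the parameter, exactly as in the two formulas above; any other polar function h can be passed in the same way.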


From a given parametric representation $x = \phi(t)$, $y = \psi(t)$ of a
curve C we can always derive many other parameter representations.
For that purpose we take an arbitrary function $\tau = \chi(t)$ which is
monotonic and continuous in that t-interval corresponding to the
points of C; the function $\chi$ has then a monotone and continuous
inverse $t = \sigma(\tau)$ in a corresponding $\tau$-interval. The coordinates of the
points (x, y) of C can then be represented in the form


$$x = \phi[\sigma(\tau)] = \alpha(\tau), \qquad y = \psi[\sigma(\tau)] = \beta(\tau).$$


The functions $\alpha(\tau)$ and $\beta(\tau)$ are again continuous; moreover, different
points of C correspond to different values of t and hence, because of
the monotone character of the function $\sigma$, to different values of $\tau$. The
total effect of the change of parameter from t to $\tau$ is that of “renaming”
the points on C.

Thus the line y = x has the parameter representation x = t, y = t,
where $-\infty < t < \infty$. Substituting $\tau = t^{1/3}$ gives rise to the parameter
representation $x = \tau^3$, $y = \tau^3$ for the same line.

Similarly, the ellipse $x^2/a^2 + y^2/b^2 = 1$ admits of the parameter
representation x = a cos t, y = b sin t, where $0 \le t < 2\pi$. Defining
$t = c\zeta + d$, for c, d real numbers ($c \neq 0$) yields another representation
$x(\zeta) = a \cos (c\zeta + d)$, $y(\zeta) = b \sin (c\zeta + d)$ for the same ellipse, with
$\zeta$ varying in the interval $-d/c \le \zeta < (2\pi - d)/c$ for c > 0, and
$(2\pi - d)/c < \zeta \le -d/c$ for c < 0. The substitution $\tau = \tan (t/2)$
leads to the “rational” parameter representation (see p. 292)


$$x = \frac{a(1 - \tau^2)}{1 + \tau^2}, \qquad y = \frac{2b\tau}{1 + \tau^2}$$


for the ellipse; as $\tau$ runs through all real values we obtain all points
of the ellipse with the exception of the point S = (-a, 0).
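
The agreement between the trigonometric and the rational representation can be checked numerically. The sketch below is only an illustration under assumed sample values (a = 3, b = 2); the function names are hypothetical and not part of the text.

```python
import math

def ellipse_trig(a, b, t):
    """Point of the ellipse x^2/a^2 + y^2/b^2 = 1 for the eccentric angle t."""
    return a * math.cos(t), b * math.sin(t)

def ellipse_rational(a, b, tau):
    """The same point expressed through the rational parameter tau = tan(t/2)."""
    denom = 1.0 + tau * tau
    return a * (1.0 - tau * tau) / denom, 2.0 * b * tau / denom

def on_ellipse(a, b, x, y, tol=1e-9):
    """True if (x, y) satisfies the implicit equation up to rounding."""
    return abs((x / a) ** 2 + (y / b) ** 2 - 1.0) < tol

if __name__ == "__main__":
    a, b = 3.0, 2.0                       # assumed sample semi-axes
    for t in (-2.5, -1.0, 0.0, 0.7, 3.0):  # values of t inside (-pi, pi)
        tau = math.tan(t / 2.0)
        print(ellipse_trig(a, b, t), ellipse_rational(a, b, tau))
```

No real value of tau produces the point (-a, 0), since that point corresponds to t = π, where tan (t/2) is undefined.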

Singularities in ordinary representation may disappear if a suitable
parameter is used. For example, we can represent the curve $y = \sqrt[3]{x^2}$
by the smooth functions $x = t^3$, $y = t^2$. The point with coordinates
x, y then describes the whole curve (semicubical parabola) as t varies
from $-\infty$ to $+\infty$.
This flexibility in the choice of the parameter often permits us to
simplify the study of geometrical properties which, of course, do not 
depend on specific representations. 

In particular, we may sometimes find it convenient to use a representation
y = f(x) for C or part of C. Such a representation is always
possible for a portion of the curve $t_0 \le t \le t_1$ in which one of the
functions $\phi$, $\psi$, say $x = \phi(t)$, is monotonic. Indeed for this portion we
have a unique inverse function $t = \chi(x)$ and thus $y = \psi[\chi(x)]$.¹


c. Motion along a Curve. Time as the 
Parameter. Example of the Cycloid 


Motion along a Curve 


Very often the parameter t has the natural physical meaning of time.
Any motion of a point in the plane may be expressed by representing
its coordinates x and y as functions of the time such that at the time t,
the point (x, y) is at (x(t), y(t)). These two functions therefore determine
the motion along a path or trajectory C in parametric form; they
constitute a mapping of the time scale onto the trajectory.²


The Cycloids and Trochoids 


An example is furnished by the cycloids, the paths of points on a
circle rolling uniformly without slipping along a straight line or another
circle. In the simplest case a circle of radius a rolls along the x-axis;
the path of a point P on its circumference is a “common” cycloid. We
choose the origin of the coordinate system and the initial time in such
a way that for time t = 0 the point P is at the origin and that at the
time t the circle has turned from its original orientation by the angle t.
This means that the circle turns clockwise with “angular velocity” one.
The circle is assumed to roll uniformly along the x-axis without sliding
so that at the time t the distance of the point of contact from the origin
is exactly equal to the length of the arc from the point of contact to P.
Thus at the time t the center M of the rolling circle must be at the point
(at, a); the center moves with constant velocity a to the right. For the


1 This is, of course, merely a statement about a property “in the small” of a curve,
meaning a statement made only for a suitably small portion. Usually (for example,
in the case of a circle), the variable x cannot be used as a parameter throughout the
whole curve but only on a portion.

2 To a change of the parameter t there would correspond then a change in the time
scale according to which the curve C is described by the moving point.
coordinates of P at the time t we find then (see Fig. 4.2) the parametric
representation


(1) x = a(t - sin t), y = a(1 - cos t).
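
A small numerical sketch can make the rolling construction behind (1) concrete: P must always lie at distance a from the center M = (at, a), and after one full turn P must touch the x-axis again at x = 2πa. The function names and the sample radius are assumptions made for illustration only.

```python
import math

def cycloid(a, t):
    """Position of P at time t according to (1)."""
    return a * (t - math.sin(t)), a * (1.0 - math.cos(t))

def rolling_center(a, t):
    """Center M of the rolling circle at time t."""
    return a * t, a

if __name__ == "__main__":
    a = 2.0  # assumed sample radius
    for t in (0.0, 1.0, math.pi, 5.0, 2.0 * math.pi):
        x, y = cycloid(a, t)
        mx, my = rolling_center(a, t)
        # Distance from P to M; it should stay equal to the radius a.
        print(t, (x, y), math.hypot(x - mx, y - my))
```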


Figure 4.2 Cycloid.


By eliminating the parameter t we can obtain the equation of the
curve in nonparametric form, at the cost, however, of neatness of
expression. We have

$$\cos t = \frac{a - y}{a}, \qquad t = \arccos \frac{a - y}{a}, \qquad \sin t = \pm\sqrt{1 - \frac{(a - y)^2}{a^2}},$$

and hence

$$x = a \arccos \frac{a - y}{a} \mp \sqrt{y(2a - y)}, \tag{1a}$$

thus obtaining x as a function of y.
Epicycloid 


Our next example is that of an epicycloid, defined as the path of a point P
fixed on the circumference of a circle of radius c, as it rolls at a uniform
speed along the circumference and outside of a second circle of radius a. Let
the fixed circle be centered at the origin of the x,y-plane. Suppose the moving
circle is rolling along the fixed one in such a way that its center has rotated
about the origin to an angle t at time t (Fig. 4.3). Then we find for the
position at the time t of the point P = (x(t), y(t)), which at the time t = 0 is
the point of contact (a, 0), the parametric equations


$$x(t) = (a + c) \cos t - c \cos \left(\frac{a + c}{c}\, t\right), \qquad y(t) = (a + c) \sin t - c \sin \left(\frac{a + c}{c}\, t\right). \tag{2}$$
Figure 4.4 Cardioid.


When a = c, the curve formed is called a cardioid (Fig. 4.4) and is given by the
parametric equations


$$x(t) = 2a \cos t - a \cos (2t), \qquad y(t) = 2a \sin t - a \sin (2t). \tag{3}$$
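
The relation between the general epicycloid and the cardioid can be checked with a short sketch. It assumes the epicycloid equations in the form x = (a + c) cos t - c cos ((a + c)t/c), y = (a + c) sin t - c sin ((a + c)t/c); the function names and sample radii are hypothetical.

```python
import math

def epicycloid(a, c, t):
    """Point P of the epicycloid (fixed radius a, rolling radius c) at time t."""
    k = (a + c) / c
    x = (a + c) * math.cos(t) - c * math.cos(k * t)
    y = (a + c) * math.sin(t) - c * math.sin(k * t)
    return x, y

def cardioid(a, t):
    """Point of the cardioid, the special case a = c."""
    return (2.0 * a * math.cos(t) - a * math.cos(2.0 * t),
            2.0 * a * math.sin(t) - a * math.sin(2.0 * t))

if __name__ == "__main__":
    a = 1.5  # assumed sample radius
    for t in (0.0, 0.8, 2.0, 4.5):
        print(epicycloid(a, a, t), cardioid(a, t))
```

For general a and c the point P stays at distance c from the center ((a + c) cos t, (a + c) sin t) of the rolling circle, which gives a second quick consistency check.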


A third variety of cycloids is obtained as the locus of a point attached to the
circumference of one circle rolling along the circumference of another
fixed circle, but interior to it. To find the parametric equations for this
“hypocycloid,” let a be the radius of the fixed circle and c that of the rolling
circle. Let the point P on the circumference of the moving circle be located
at (a, 0) at time t = 0. Suppose that the rolling circle is moving along the
fixed one in such a way that at time t its center has rotated about the origin
through an angle t (Fig. 4.5). Then we find the parametric equations for the
hypocycloid to be


$$x(t) = (a - c) \cos t + c \cos \left(\frac{a - c}{c}\, t\right), \qquad y(t) = (a - c) \sin t - c \sin \left(\frac{a - c}{c}\, t\right). \tag{4}$$


Figure 4.5 Hypocycloid.


In the special case when the fixed circle has twice the radius of the moving
one, c = a/2, we find


$$x(t) = a \cos t, \qquad y(t) = 0,$$
and the hypocycloid degenerates into the diameter of the fixed circle,
described back and forth. The interesting feature of this example is that it
provides a mechanical solution of the problem of drawing a straight line by 
using merely circular motions (Fig. 4.6). 


Figure 4.6 A point P on the rim of a circle rolling inside a circle of twice the radius 
describes a straight line segment. 


If the radius of the fixed circle is three times that of the moving one, then
c = a/3, and

$$x(t) = \tfrac{2}{3} a \cos t + \tfrac{1}{3} a \cos (2t),$$

$$y(t) = \tfrac{2}{3} a \sin t - \tfrac{1}{3} a \sin (2t).$$

By an elementary computation we find

$$x^2 + y^2 = \tfrac{5}{9} a^2 + \tfrac{4}{9} a^2 \cos (3t),$$

so that the hypocycloid meets the fixed circle at exactly three points and the
curve appears as shown in Fig. 4.5.
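
Both special cases can be confirmed numerically. The sketch below assumes the hypocycloid equations in the form x = (a - c) cos t + c cos ((a - c)t/c), y = (a - c) sin t - c sin ((a - c)t/c); the function name and the sample radius a = 3 are illustrative assumptions.

```python
import math

def hypocycloid(a, c, t):
    """Point P of the hypocycloid (fixed radius a, rolling radius c) at time t."""
    k = (a - c) / c
    x = (a - c) * math.cos(t) + c * math.cos(k * t)
    y = (a - c) * math.sin(t) - c * math.sin(k * t)
    return x, y

if __name__ == "__main__":
    a = 3.0  # assumed sample radius
    for t in (0.0, 0.5, 2.0, 4.0):
        # c = a/2: the curve degenerates to the diameter, y = 0, x = a cos t.
        print(hypocycloid(a, a / 2.0, t), a * math.cos(t))
        # c = a/3: x^2 + y^2 should equal (5/9) a^2 + (4/9) a^2 cos(3t).
        x, y = hypocycloid(a, a / 3.0, t)
        print(x * x + y * y, (5.0 + 4.0 * math.cos(3.0 * t)) * a * a / 9.0)
```

Since $x^2 + y^2 = a^2$ requires cos (3t) = 1, the curve with c = a/3 reaches the fixed circle only for t = 0, 2π/3, 4π/3 within one period, in agreement with the three points of contact.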


Trochoids 


More general curves called trochoids (epitrochoids, hypotrochoids) are 
obtained if we consider the motion of a point P attached to a circle (but not 
necessarily on its rim) when that circle rolls along a straight line or along the 
outside or inside of another circle (see Fig. 4.7). The same type of curve 
arises as the path of a point moving uniformly on a circle while the center of 
the circle itself moves uniformly along a line or circle. These curves play a 
central role in the Ptolemaic description of the apparent motion of the
planets.


Figure 4.7 Trochoid.

Some of the remarkable properties of cycloids will be discussed later on in 
this chapter (p. 428). 


d. Classifications of Curves. Orientation 
Definitions 


Among the most obvious features of a curve are the number of 
separate pieces or branches and the number of loops which it has. A 
hyperbola is an example of a curve consisting of two disjoint branches; 
another such example is the curve $y^2 = (4 - x^2)(x^2 - 1)$ which consists
of two separate ovals. We shall be concerned mainly with curves 
consisting of one piece, the connected curves. A connected curve can 
intersect itself, like the trochoid (Fig. 4.7) or the “lemniscate” of Fig. 
153, p. 103. 

A connected curve without self-intersections is called simple. Among 
the simple curves we can still distinguish the closed curves, such as 
circles or ellipses, from the ones that are not closed, such as parabolas 
or straight-line segments. We shall not attempt here to give either a 
rigorous or a complete classification of curves, but only point out 
certain “topological” features of a curve relevant for parameter 
representation. 


Simple Arcs 


A parameter representation of a curve C by two continuous functions
$x = \phi(t)$, $y = \psi(t)$ defines a mapping of the t-axis or of a portion of it
onto C. We call C a simple arc if it can be represented in such a way
that the parameter t describes a closed interval [a, b] on the t-axis,
forming the domain of the functions $\phi(t)$, $\psi(t)$, and if in addition
different t in the interval correspond to different points P on C. An
example is the parabolic arc x = t, $y = t^2$ for $0 \le t \le 1$.

The same arc C (that is, the same points in the plane) can be represented
parametrically in many ways. Any monotone continuous
function $\tau = \chi(t)$ for $a \le t \le b$ defines a parameter $\tau$ such that x and y
are continuous functions of $\tau$ in a suitable closed interval $[\alpha, \beta]$,
different values of $\tau$ corresponding to different P. As a matter of fact,
as is easily seen, the continuous monotone substitutions $\tau = \chi(t)$
provide the most general continuous parameter representations of a
simple arc that assign different points of the arc to different parameter
values. (See the remarks on p. 55 about one-to-one continuous
mappings.)

To a special parameter representation x = x(t), y = y(t) of a simple
arc C belongs a definite sense on C corresponding to the direction of
increasing t. Given any two distinct points $P_0$, $P_1$ we say that $P_1$
follows $P_0$ if $P_1$ belongs to the larger value of the parameter t. If we
introduce a new parameter $\tau$ by a continuous increasing function
$\tau = \chi(t)$ the order of the pairs of points with respect to $\tau$ is the same;
the parameter $\tau$ defines the same sense on C. If $\chi(t)$ is decreasing, the
sense is reversed.


Direction or Orientation of Arcs 


A directed or oriented simple arc is one on which a definite sense has
been selected (for example, that sense corresponding to an increase in
a particular choice of the parameter t); that sense is then called the
positive sense on the arc. The positive sense is completely specified if
we know which of the two end points of the arc follows the other one.
We call the end point that follows the final point of the arc, and the
other one the initial point. Given any parameter representation
$x = x(\tau)$, $y = y(\tau)$ of the oriented arc, where $a \le \tau \le b$, the positive
sense will be that of increasing $\tau$ if the parameter value $\tau = a$ corresponds
to the initial point and $\tau = b$ to the final point; otherwise the
sense of increasing $\tau$ will be the negative sense on the arc (Fig. 4.8).

Any two distinct points $P_0$, $P_1$ on a simple arc C define a sub-arc
with end points $P_0$, $P_1$, which consists of the points with parameter
values between those of $P_0$ and $P_1$. If C is a directed arc and $P_1$ follows
$P_0$ in the positive sense on C, we obtain a directed sub-arc with initial
point $P_0$ and final point $P_1$. A finite number of points of subdivision
on a directed simple arc C breaks up that arc into a sequence of directed
sub-arcs, the initial point of one sub-arc being the final point of the
preceding one.


Figure 4.8 Sense and parameter representation.


Often it is impractical to restrict oneself to simple arcs and to insist
that different parameter values t shall belong to different points of the
curve. If, for example, the equations x = x(t), y = y(t) give the position
of a moving particle P at the time t, there is no reason why the
particle should not stand still for a while or why its path should not be
allowed to cross itself so that the particle returns to the same position
at a later time.


Figure 4.9 A curve with a loop: $x = t^2 - 1$, $y = t^3 - t$ with sense of increasing t.




An example is the curve $x = t^2 - 1$, $y = t^3 - t$ [which also could be
described completely by the cubic equation $y^2 - x^2(1 + x) = 0$]. As t
varies from $-\infty$ to $+\infty$ the curve crosses the origin twice, for t = -1
and t = +1 (Fig. 4.9). We verify easily that all other points of the
curve belong to a unique value of t. Geometrically, the interval
$-1 \le t \le +1$ corresponds to a loop of the curve. Here again the sense
of increasing t defines a certain order among the points of the curve, at
least if we visualize in some way the points corresponding to t = -1
and t = +1 as distinct, one lying “on top” of the other one. The
whole oriented cubic curve can be decomposed into directed simple
arcs, for example, into the arcs corresponding to $n \le t \le n + 1$, where
n runs over all integers.
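
The two claims about this curve, that every point satisfies the cubic equation and that the origin is reached for both t = -1 and t = +1, are easy to check numerically. The following sketch uses hypothetical function names and is meant only as an illustration.

```python
def loop_curve(t):
    """Point (x, y) = (t^2 - 1, t^3 - t) of the curve with a loop."""
    return t * t - 1.0, t ** 3 - t

def cubic_residual(x, y):
    """Left-hand side of the implicit equation y^2 - x^2 (1 + x) = 0."""
    return y * y - x * x * (1.0 + x)

if __name__ == "__main__":
    for t in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
        x, y = loop_curve(t)
        print(t, (x, y), cubic_residual(x, y))
```

The identity behind the residual is y = t(t^2 - 1) = tx, so that y^2 = t^2 x^2 = (1 + x) x^2.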


Closed Curves 


The standard example of a parameter representation in which 
different t correspond to the same point on the curve is given by the 
formulas 

x = a cos t, y = a sin t,


which describe the uniform motion of a point on a circle with t as the
time. As t varies from $-\infty$ to $+\infty$ the point P = (x, y) describes the
circle infinitely often in the counterclockwise sense. We can cause
the points of the circle to be described exactly once by restricting t to
any half-open interval of length $2\pi$:


$$\alpha \le t < \alpha + 2\pi.$$


The end points $\alpha$ and $\alpha + 2\pi$ of the interval correspond to the same
point on the circle. Here the end points of the parameter interval have
no special geometrical significance for the curve.

Generally, a pair of continuous functions $x = \phi(t)$, $y = \psi(t)$ defined
in a closed interval $a \le t \le b$ will represent a closed curve if
$\phi(a) = \phi(b)$, $\psi(a) = \psi(b)$. The closed curve will be simple if different t-values
with $a \le t < b$ correspond to different points (x, y).

The point corresponding to t = a and t = b could be any point on
the curve; it is just the point at which we “break” the curve to make its
points correspond to those of an interval on the axis.


Closed Curves Represented by Periodic Functions 


Just as in the example of the circle we can avoid distinguishing any
particular break by taking for $\phi(t)$ and $\psi(t)$ periodic functions with
period p = b - a. It is of value here to make some general remarks
about periodic functions to which we will turn more extensively in
Chapter 8.

A function f(t) is called periodic with period p if it is defined for all t
and satisfies the equation f(t) = f(t + p). Thus, for example, the
trigonometric functions sin t and cos t are periodic with period $2\pi$.
(Any multiple $2n\pi$ where n is an integer is then also a period.)
Geometrically interpreted f(t) has the period p if a shift of its graph by p
units to the right leads to the same graph again.


Figure 4.10 Graph of a periodic function f(t).


Since then f(t) “repeats” itself, a function f(t) of period p is determined
for all t if it is known merely in a single interval $a \le t < b$ of length
p = b - a (Fig. 4.10). Indeed for every t there exists a value $t'$ in the
interval $a \le t' < b$ such that $t - t' = np$, where n is an integer [one
only has to take for n the largest integer that does not exceed (t - a)/p].
Then $f(t) = f(t')$ is known.
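
The reduction of t to the basic interval translates directly into a small routine. The sketch below is a minimal illustration, with hypothetical names, of the rule "take for n the largest integer not exceeding (t - a)/p"; in floating-point arithmetic the reduced value can differ from the exact one by rounding.

```python
import math

def periodic_extension(f, a, b):
    """Extend f, given on the half-open interval a <= t < b, with period p = b - a."""
    p = b - a

    def extended(t):
        n = math.floor((t - a) / p)   # largest integer not exceeding (t - a)/p
        t_reduced = t - n * p         # lies in a <= t' < b (up to rounding)
        return f(t_reduced)

    return extended

# The "fractional part of t": f(t) = t on 0 <= t < 1, continued with period 1.
fractional_part = periodic_extension(lambda t: t, 0.0, 1.0)

if __name__ == "__main__":
    for t in (-1.75, -0.25, 0.0, 0.5, 2.25):
        print(t, fractional_part(t))
```

`math.floor` is used rather than truncation so that negative t are reduced correctly.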

As a matter of fact we can start with any continuous function f(t) in
a half-open interval $a \le t < b$; the extended function will clearly be
continuous for all t which are not of the form t = a + np with an
integer n (Fig. 4.11).


Figure 4.11 Periodic continuation of a function f(t) from the interval $a \le t < b$.

For example, extending the function f(t) defined by f(t) = t for
$0 \le t < 1$ periodically leads to a function of period p = 1 which we
can call the “fractional part of t,” and which is discontinuous at points
t which are integers (Fig. 4.12a). Generally, at t = a + np the periodically
extended function f will have the value f(a); this will also be the
limit of f on approaching the point from the right, whereas the limit of f
from the left will be the same as at the point b. In the case of greatest
interest to us at present we start out with a function defined and
continuous in the closed interval $a \le t \le b$ which moreover has the
same value in the end points, f(a) = f(b). Continuing such a function
periodically always leads to a function f(t) of period p = b - a which
is continuous for all t (Fig. 4.12b).

Continuous periodic functions are ideal for representing closed
curves C. Let C be given parametrically by $x = \phi(t)$, $y = \psi(t)$, for $\phi$, $\psi$
continuous in the interval $a \le t \le b$ and having the same values in both
end points. We can extend the definition of these functions to all values
of t in such a way that $\phi$ and $\psi$ have period b - a = p and are
continuous for all t. For any t the extended parameter representation only
yields points of C, since we have $t = t' + np$ with n an integer and
$a \le t' < b$. The point corresponding to t is then the same as the one
corresponding to $t'$, which lies on C. As t varies from $-\infty$ to $+\infty$ the
point (x, y) traverses the curve C infinitely often, just as in the circle
x = a cos t, y = a sin t. Here the distinguished role of the parameter
value t = a is removed. For any $\alpha$ the whole curve is already represented
by $x = \phi(t)$, $y = \psi(t)$ when t runs from $\alpha$ to $\alpha + p$.


Figure 4.12 Periodic continuation of functions f(t) from the interval $0 \le t < 1$.
Here (a) f(t) = t, (b) $f(t) = 2t - 2t^2$.

A portion of the closed curve C corresponding to the parameter
values t in an interval $\alpha \le t \le \beta$ forms a simple arc if different t-values
in that interval lead to different points (x, y). The whole closed curve C
is a simple curve if different t in the same interval $\alpha \le t < \alpha + p$
always lead to different points on C. Thus any closed parameter
interval of length less than p gives a simple arc.


Closed Curves Composed of Simple Arcs. Order of Points 


The closed curves which we shall consider can all be decomposed
into simple arcs. If the whole closed curve C is simple, it can be
decomposed into two simple arcs $t_0 \le t \le t_1$ and $t_1 \le t \le t_0 + p$
which have only their end points $P_0$, $P_1$ in common. The sense of
increasing t determines a positive sense or orientation on C by fixing a
positive direction on each simple arc of C. Any two distinct points $P_0$,
$P_1$ on the simple closed curve C divide C into two simple arcs. In the
sense of increasing t exactly one of the two arcs will have $P_0$ as the initial
point and $P_1$ as the end point; we will call it the arc $P_0P_1$; the reverse
holds for the other arc.


Orientation and Order 


The positive orientation of C can also be characterized by an ordered
triple of points $P_0P_1P_2$ of C if we specify that $P_2$ does not lie on the
simple directed arc with initial point $P_0$ and final point $P_1$. The triples
$P_1P_2P_0$ and $P_2P_0P_1$ obtained by a cyclic permutation from $P_0P_1P_2$
describe the same orientation (Fig. 4.13a).


Figure 4.13 Orientation of closed curves in the sense of increasing t.

*Quite generally, any n distinct points on the oriented closed simple
curve C always follow each other in a certain order $P_1P_2 \cdots P_n$
determined up to cyclic permutations¹, and divide C into directed simple
arcs $P_1P_2, \ldots, P_{n-1}P_n, P_nP_1$. We can always choose parameter
values $t_1, t_2, \ldots, t_n$ for the points $P_1, P_2, \ldots, P_n$ such that the $t_i$ form
a monotone increasing sequence and are all contained in one and the
same parameter interval of length equal to the period p (Fig. 4.13b).


Orientation of Curves and Angles 


As already emphasized in Chapter 1 we are forced to make use of
the sign plus or minus to establish satisfactory relations between
geometric entities and analytic concepts expressed by numbers. Directed
lines, such as the number axis, are the simplest instances. Which
direction on a line we define as positive is arbitrary at the beginning.
A positive sense corresponding to increasing t can be associated with
any particular parameter representation x = at + b, y = ct + d of the
line. A line oriented in this way points in a certain direction. Two
parallel directed lines have either the same or the opposite direction.
A direction can also be determined by a ray issuing from a point $P_0$,
that is, by a half-line which consists of the points on a line which
“follow” a given point $P_0$ in the positive sense.


Figure 4.14 Angle of inclination t of a direction D.


1 That is, $P_1P_2 \cdots P_{n-1}P_n$, $P_2P_3 \cdots P_nP_1$, ..., $P_nP_1 \cdots P_{n-2}P_{n-1}$ give the
same orientation.
Any direction in the plane can be represented by a ray from the origin
or also by the point P on the circle of radius 1 about the origin that lies
on that ray. If we represent this unit circle parametrically by x = cos t,
y = sin t, we have associated with every direction certain values t,
differing from each other by multiples of $2\pi$. We call them the angles
of inclination of the direction or the angles the direction makes with the
positive x-axis. There is always exactly one angle of inclination t for
which $0 \le t < 2\pi$ (Fig. 4.14).

The angles between two directions are simply the differences of their
angles of inclination. More precisely, since the order in which we take
the two directions matters, we say that a direction with inclination $t'$
forms with a direction with inclination $t''$ an angle $\alpha = t' - t''$ (Fig. 4.15).


Figure 4.15 Angle $\alpha$ the direction D′ forms with the direction D″.


Since $t'$ and $t''$ can be changed by integral multiples of $2\pi$, the same change
is permissible for the angle one direction makes with another one.


Sense of Rotation 


We also say that the direction with angle of inclination $t''$ passes into
that with inclination $t'$ by a rotation through the angle $\alpha$. The intuitive
idea of rotation here is that of a continuous motion, by which the direction
with inclination $t''$ goes into that with inclination $t'$ by passing
through directions with all possible inclinations t intermediate between
$t''$ and $t'$. We call the rotation positive or counterclockwise if
$\alpha = t' - t''$ is positive, and negative or clockwise in the opposite case. Of
course, there are many different rotations both clockwise and counterclockwise
that will take a given direction into another given one unless
we insist that the angle of rotation $\alpha$ satisfies $-\pi < \alpha \le \pi$.
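
The normalization of the angle of rotation to the range $-\pi < \alpha \le \pi$ can be sketched as follows. The function name `rotation_angle` is an assumption made for this illustration; the inputs are any two angles of inclination, each determined only up to multiples of $2\pi$.

```python
import math

def rotation_angle(t1, t2):
    """Angle alpha = t1 - t2 reduced to the range -pi < alpha <= pi.

    t1 is the inclination of the direction rotated into,
    t2 the inclination of the direction rotated from.
    """
    alpha = (t1 - t2) % (2.0 * math.pi)   # now 0 <= alpha < 2*pi
    if alpha > math.pi:
        alpha -= 2.0 * math.pi            # move into (-pi, pi]
    return alpha

if __name__ == "__main__":
    print(rotation_angle(math.pi / 2.0, 0.0))        # counterclockwise quarter turn
    print(rotation_angle(0.0, math.pi / 2.0))        # clockwise quarter turn
    print(rotation_angle(3.0 * math.pi / 2.0, 0.0))  # reduced to a clockwise quarter turn
```

A positive result corresponds to a counterclockwise rotation and a negative one to a clockwise rotation, in the sense defined above.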
Ultimately then, the positive sense of rotation is associated with
a particular parameter representation x = cos t, y = sin t of the
circle which we have chosen. If, as usual, the x-axis points to the
right and the y-axis upwards, then the positive sense of rotation
coincides with the sense opposite to that of the hands on a conventional
clock.¹


Positive and Negative Sides of a Curve 


A curve separates the points of the plane near one of its points P
into two classes. Locally at least we can distinguish two “sides” of
the curve. If the curve C is oriented, we can define a positive (or “left”)
and a negative (or “right”) side² as follows: Consider a ray issuing
from P. We say that this ray points to the positive side of the curve if
there are points Q on the curve arbitrarily close to P and following P in
the sense given to the curve, such that the angle through which a line
from P to Q must be rotated in the counterclockwise sense to reach the
given ray lies between 0 and $\pi$ (Fig. 4.16). The points on the ray
close to P are then said to lie on the positive side of the curve. In the
opposite case the ray is said to point to the negative side of C, and the
points on it are said to lie on the negative side of the curve. If the
curve C is a simple closed curve, it divides all points of the plane into
two classes, those interior to C and those exterior to C.³ We say that C
has the counterclockwise orientation if its interior lies on the positive
(that is, left) side (Fig. 4.17).


Figure 4.16 Positive and negative side of oriented arc.

If the closed curve C, however, consists of several loops, then it is 
not always possible to describe C so that all enclosed regions are on 
the positive side of C (see Fig. 4.18). 


— 


1 This sense, in turn, is suggested by the motion of the shadow on the ground in a
sundial in the northern hemisphere.

2 The terms "left" and "right" side correspond to the ordinary usage of the words
"left bank" and "right bank" for a river oriented by its direction of flow.

3 These concepts, as well as the division of the plane by a simple closed continuous
curve into two parts, are analyzed precisely in topology and must be accepted here
on an intuitive basis.

Figure 4.17 Simple closed curve with counterclockwise orientation.


Figure 4.18


e. Derivatives, Tangent and Normal, in Parametric Representation 
Direction and Speed 
For a curve $C$ given in parameter representation with the time
parameter $t$,

\[ x = x(t) = \phi(t), \qquad y = y(t) = \psi(t), \]

we denote the derivatives, as Newton did, by a dot:

\[ \dot{x} = \frac{dx}{dt} = \dot{\phi}(t), \qquad \dot{y} = \frac{dy}{dt} = \dot{\psi}(t). \]

The derivatives $\dot{x}$, $\dot{y}$ are often conveniently visualized as
the "velocity components" or the "speeds" of the coordinates of a point
$P$ moving along $C$.

Whenever $\dot{x} \ne 0$, it is possible to represent the corresponding
portion of $C$ by an equation $y = f(x)$ by first calculating $t$ as a
function of $x$ from the first equation and then substituting the
resulting expression for $t$ into the second equation. By the chain rule
of differentiation and the rule for the derivative of the inverse of a
function (see p. 207) we find then for the slope of the tangent to the
curve

\[ \frac{dy}{dx} = \frac{dy}{dt}\,\frac{dt}{dx} = \frac{\dot{y}}{\dot{x}}. \]

The equivalent formula $dx/dy = \dot{x}/\dot{y}$ holds if $\dot{y} \ne 0$.

Unless the contrary is stated we always assume that $\dot{x}$ and
$\dot{y}$ do not vanish simultaneously or, concisely written, we assume

\[ \dot{x}^2 + \dot{y}^2 \ne 0. \]

Then the tangent always exists;¹ it is horizontal if $\dot{y} = 0$ and
vertical if $\dot{x} = 0$.
For the cycloid, for example, [see Eq. (1), p. 329] we have

\[ \dot{x} = a(1 - \cos t) = 2a \sin^2 \frac{t}{2}, \]

\[ \dot{y} = a \sin t = 2a \sin\frac{t}{2} \cos\frac{t}{2}, \]

\[ \frac{dy}{dx} = \cot\frac{t}{2}. \]

These formulas show that $\dot{x}^2 + \dot{y}^2 \ne 0$ except for
$t = 0, \pm 2\pi, \pm 4\pi, \ldots$. Moreover, the cycloid has a cusp
(that is, a point where it reverses direction), with a vertical tangent,
at those exceptional points, at which it also meets the $x$-axis, that
is, where $y = 0$; for on approaching these points, the derivative
$y' = \dot{y}/\dot{x} = \cot(t/2)$ becomes infinite.


Tangent, Normal, and Direction Cosines 


The equation of the tangent to the curve at the point $x$, $y$ is

\[ \eta - y = \frac{dy}{dx}\,(\xi - x), \]

where $\xi$ and $\eta$ are the "running" coordinates corresponding to an
arbitrary point on the tangent, whereas $x$, $y$, and $dy/dx$ have the
fixed values belonging to the point of contact. Substituting
$\dot{y}/\dot{x}$ for $dy/dx$ we can write the equation of the tangent in
the form

\[ (\xi - x)\,\dot{y} - (\eta - y)\,\dot{x} = 0. \tag{5} \]

Exactly the same equation is obtained under the assumption
$\dot{y} \ne 0$; we only have to express $x$ as a function of $y$. At the
exceptional points where both $\dot{x}$ and $\dot{y}$ vanish for the same
$t$ the equation becomes meaningless, since it is satisfied for all
$\xi$, $\eta$.


1 We observe that the condition $\dot{x}^2 + \dot{y}^2 \ne 0$, although sufficient, is
not necessary to guarantee a nonparametric representation. Thus we may define the
curve $y = x^3$ by means of the parametric equations $x = t^3$, $y = t^9$. At the origin
of the $t$-axis, the condition of positivity for $\dot{x}^2 + \dot{y}^2$ fails, but still
the curve has a definite and well-defined nonparametric representation.

The normal to the curve, that is, the straight line through a point of
the curve perpendicular to the tangent at that point, has the slope
$-dx/dy$. This leads to the equation

\[ (\xi - x)\,\dot{x} + (\eta - y)\,\dot{y} = 0 \tag{6} \]

for the normal.

If a point of $C$ corresponds to several values of $t$, then in general a
different tangent exists for each of the branches of the curve passing
through the point, that is, for each value of $t$. For example, the curve
$x = t^2 - 1$, $y = t^3 - t$ (Fig. 4.9, p. 335) passes through the origin
for $t = -1$ and $t = +1$. For $t = -1$ we find for the equation of the
tangent $\xi + \eta = 0$, whereas the tangent for $t = +1$ is given by
$\xi - \eta = 0$.
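The two tangents at this double point can be checked directly from equation (5). The following is a minimal sketch rather than part of the original text; the helper names `tangent_coeffs` and `velocity` are illustrative assumptions.

```python
def tangent_coeffs(xdot, ydot):
    """Return (A, B) so that the tangent (5) reads A*(xi - x) + B*(eta - y) = 0."""
    if xdot == 0 and ydot == 0:
        # Both derivatives vanish: equation (5) holds for all xi, eta.
        raise ValueError("critical point: tangent not determined by (5)")
    return ydot, -xdot


def velocity(t):
    """Derivatives of x = t**2 - 1, y = t**3 - t with respect to t."""
    return 2.0 * t, 3.0 * t * t - 1.0


for t in (-1.0, 1.0):
    A, B = tangent_coeffs(*velocity(t))
    # The curve passes through the origin for both values, so x = y = 0.
    print(f"t = {t:+.0f}: {A:+.0f}*xi {B:+.0f}*eta = 0")
```

Dividing each line by 2 gives the two equations stated in the text, one for each branch through the origin.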
From the definition of derivative we have

\[ \frac{dy}{dx} = \tan\alpha, \]

where $\alpha$ is the angle the tangent makes with the $x$-axis. This
means that a rotation by the angle $\alpha$ applied to the $x$-axis
(counterclockwise if $\alpha > 0$, clockwise if $\alpha < 0$) will cause
it to be parallel to the tangent. Rotations by the angles
$\alpha \pm \pi$, $\alpha \pm 2\pi$, ... will then also make the $x$-axis
parallel to the tangent. Hence the angle $\alpha$ is determined only to
within a multiple of $\pi$, whereas $\tan\alpha$ is determined uniquely.
From the relations $\dot{y}/\dot{x} = (\sin\alpha)/(\cos\alpha)$ and
$\dot{x}^2 + \dot{y}^2 \ne 0$ we find

\[ \cos\alpha = \pm\frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\alpha = \pm\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \]

(where the same sign must be taken in both formulas). We call
$\cos\alpha$ and $\sin\alpha$ the direction cosines of the tangent.¹


Assigning Directions to Tangent and Normal 


The two possible choices for the direction cosines correspond to the
two directions in which we can traverse the tangent; the corresponding
angles $\alpha$ differ by an odd multiple of $\pi$. One of the two
directions on the tangent corresponds to increasing $t$, the other one to
decreasing $t$. Assume that the sense on the curve is that of increasing
$t$. Then, by definition, the positive direction on the tangent, or the
one that corresponds to increasing values of $t$, is the one that forms
with the positive


1 One thinks here of $\sin\alpha$ as $\cos\beta$, where $\beta = \pi/2 - \alpha$ is the
angle the $y$-axis forms with the tangent.

$x$-axis an angle $\alpha$ for which $\cos\alpha$ has the same sign as
$\dot{x}$ and $\sin\alpha$ the same sign as $\dot{y}$. The direction
cosines of that direction on the tangent are then, without ambiguity,

\[ \cos\alpha = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\alpha = \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \tag{7} \]

If, say, $\dot{x} = dx/dt > 0$, then the direction of increasing $t$ on
the tangent is that of increasing $x$; the angle that direction forms
with the positive

Figure 4.19 Positive tangent and normal of an oriented curve.


$x$-axis has then a positive cosine. Similarly, that normal direction
obtained by rotating the direction of the positive tangent corresponding
to increasing $t$ in the positive (counterclockwise) sense by $\pi/2$ has
the unambiguous direction cosines

\[ \cos\left(\alpha + \frac{\pi}{2}\right) = -\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\left(\alpha + \frac{\pi}{2}\right) = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \]

It is called the positive normal direction and points to the "positive
side" of the curve (Fig. 4.19).
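The direction cosines of the positive tangent (7) and of the positive normal can be computed directly from the velocity components. The following is a small sketch under that reading; the function name `tangent_and_normal` is an illustrative assumption, not notation from the text.

```python
import math


def tangent_and_normal(xdot, ydot):
    """Direction cosines of the positive tangent and the positive normal."""
    speed = math.hypot(xdot, ydot)
    if speed == 0.0:
        raise ValueError("critical point: direction cosines are undefined")
    tangent = (xdot / speed, ydot / speed)   # (cos a, sin a), formula (7)
    normal = (-ydot / speed, xdot / speed)   # tangent rotated by +pi/2
    return tangent, normal


# Unit circle x = cos t, y = sin t at t = 0, where the velocity is (0, 1).
tangent, normal = tangent_and_normal(0.0, 1.0)
print(tangent, normal)
```

For the counterclockwise circle the positive normal at the point (1, 0) points toward the center, in agreement with the statement that the interior lies on the positive side.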

If we introduce a new parameter $\tau = \chi(t)$ on the curve, then the
values of $\cos\alpha$ and $\sin\alpha$ stay unchanged if $d\tau/dt > 0$
and they change sign if $d\tau/dt < 0$; that is, if we change the sense of
the curve, then the positive sense of tangent and normal likewise is
changed.


Critical Points


If $\dot{x}$ and $\dot{y}$ are continuous and $\dot{x}^2 + \dot{y}^2 > 0$,
the quantities $\cos\alpha$ and $\sin\alpha$ which determine the direction
of the tangent will vary continuously with $t$. The tangent, whose
equation is

\[ (\xi - x)\sin\alpha - (\eta - y)\cos\alpha = 0, \]

then changes continuously along the curve, as does the normal.

If both $\dot{x}$ and $\dot{y}$ vanish for a certain value of $t$, the
direction cosines of the tangent are not defined by our formulas; a
tangent may fail to exist altogether or it may not be determined uniquely.
Such a point is called a "critical" point or a "stationary" point. We
illustrate by examples various possibilities that arise at critical
points.

One example is furnished by the curve $y = |x|$ with the parameter
representation $x = t^3$, $y = |t|^3$; this curve has a corner for $t = 0$
although both $\dot{x}$ and $\dot{y}$ stay continuous. In the example of
the cycloid, discussed on p. 344, the "stationary" points at which
$\dot{x} = \dot{y} = 0$ correspond to cusps. On the other hand, the
vanishing of $\dot{x}$ and $\dot{y}$ in some cases is merely inherent in a
specific parameter representation and not connected with the behavior of
the curve, as for the straight line represented by $x = t^3$, $y = t^3$
for the parameter value $t = 0$.


Corners 


Curves consisting of several smooth arcs meeting at corners are
represented conveniently in parameter representation by functions $x(t)$,
$y(t)$ which are continuous but have derivatives $\dot{x}$, $\dot{y}$ with
jump discontinuities. This is illustrated by the trivial example of the
broken line represented by

\[ x = t, \quad y = 0 \quad \text{for } t \le 0 \]

and

\[ x = t, \quad y = t \quad \text{for } t \ge 0. \]

Here $\dot{x} = 1$, $\dot{y} = 0$ for $t < 0$ and $\dot{x} = 1$,
$\dot{y} = 1$ for $t > 0$. At $t = 0$ the tangent is indeterminate (see
Fig. 4.20).

Figure 4.20 Graph of $x = t$, $y = \frac{1}{2}(t + |t|)$.


f. The Length of a Curve 
The Length as an Integral 


Two different types of geometrical properties or quantities are 
associated with curves. The first type depends only on the behavior of 
the curve in the small, that is, in the immediate neighborhood of a 
point; such properties are those which can be expressed by means of 
derivatives at the point. Properties of the second type or properties in 
the large depend on the whole configuration of the curve or of a portion 
of the curve, and are usually expressed analytically by means of the 
concept of integral. We shall begin by considering a quantity of the 
second type, the length of a curve. 

Of course, we have an intuitive notion of what we mean by the 
length of a curve. However, just as in the classical case of circular arcs, 
a precise mathematical meaning must be given to the intuitive concept. 
Guided by intuition we define the length of an arbitrary curve as the 
limit of the lengths of approximating polygons, in particular, inscribed 
polygons. The lengths of polygons, in turn, are immediately defined as 
soon as a unit of length is chosen. The final result will be the expression 
of length by an integral. 

We assume our curve given in the form $x = x(t)$, $y = y(t)$,
$\alpha \le t \le \beta$. In the interval between $\alpha$ and $\beta$ we
choose intermediate points $t_1, t_2, \ldots, t_{n-1}$ such that

\[ \alpha = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = \beta. \]

We join the points $P_0, P_1, \ldots, P_n$ on the curve corresponding to
these values $t_i$, in order, by line segments, thus obtaining an
inscribed polygon. The length of the perimeter of this inscribed polygon
depends on the way in which the points $t_i$, or the vertices $P_i$ of
the polygon, are chosen. We now let the number of the points $t_i$
increase beyond all bounds in such a way that at the same time the length
of the longest subinterval $(t_i, t_{i+1})$ tends to zero. The length of
the curve is then defined to be the limit of the perimeters of these
inscribed polygons, provided that such a limit exists and is independent
of the particular way in which the polygons are chosen. When this
assumption (assumption of rectifiability) is fulfilled, we can speak of
the length of the curve.

We assume that the functions $x(t)$ and $y(t)$ have continuous
derivatives $\dot{x}(t)$ and $\dot{y}(t)$ for $\alpha \le t \le \beta$.
The inscribed polygon corresponding to the subdivision of the
$t$-interval by points $t_i$ with $\Delta t_i = t_{i+1} - t_i$ has
vertices $P_i = (x(t_i), y(t_i))$; its total length is given by the
expression

\[ S_n = \sum_{i=0}^{n-1} \overline{P_i P_{i+1}} = \sum_{i=0}^{n-1} \sqrt{[x(t_{i+1}) - x(t_i)]^2 + [y(t_{i+1}) - y(t_i)]^2} \]

according to the theorem of Pythagoras (cf. Fig. 4.21, p. 356). By the
mean value theorem of differential calculus

\[ x(t_{i+1}) - x(t_i) = \dot{x}(\xi_i)\,\Delta t_i, \qquad y(t_{i+1}) - y(t_i) = \dot{y}(\eta_i)\,\Delta t_i, \]

where $\xi_i$ and $\eta_i$ are intermediate values in the interval
$t_i < t < t_{i+1}$. This leads to the expression

\[ S_n = \sum_{i=0}^{n-1} \sqrt{\dot{x}^2(\xi_i) + \dot{y}^2(\eta_i)}\;\Delta t_i \]

for the length of the polygon, where we have made use of the fact that
the differences $\Delta t_i$ are positive. If the number $n$ of points of
subdivision $t_i$ increases beyond all bounds while at the same time the
largest value $\Delta t_i$ tends to zero, the sum $S_n$ tends to the
integral

\[ L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt. \]

This fact is a direct consequence of the existence theorems for integrals
in Chapter 2.¹

This proves that for continuous $\dot{x}$, $\dot{y}$ the curve actually
has a length and that this length is given analytically by the expression

\[ L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt. \tag{8} \]

The same is true if $\dot{x}$ and $\dot{y}$ are allowed to be
discontinuous at isolated points, where then the curve may not have a
unique tangent; the


1 Since the intermediate points $\xi_i$ and $\eta_i$ need not coincide, we make use of
the more general approximating sums that were shown to converge to the integral on
p. 195.

integral of course must then be considered as an "improper" one (see
Chapter 3, p. 301). More general "rectifiable" curves, for which our
integral is meaningful, will not be discussed in this volume.
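The convergence of the polygon perimeters to the integral (8) can be observed numerically. The sketch below is an illustration only: it uses equal parameter steps, a midpoint rule for the integral, and, as an assumed test case, one arch of the cycloid with a = 1, whose length is known to be 8a. The function names are hypothetical.

```python
import math


def polygon_length(x, y, alpha, beta, n):
    """Perimeter of the inscribed polygon for n equal steps of the parameter."""
    ts = [alpha + (beta - alpha) * i / n for i in range(n + 1)]
    return sum(
        math.hypot(x(ts[i + 1]) - x(ts[i]), y(ts[i + 1]) - y(ts[i]))
        for i in range(n)
    )


def integral_length(xdot, ydot, alpha, beta, n):
    """Midpoint-rule approximation of the integral in formula (8)."""
    h = (beta - alpha) / n
    return h * sum(
        math.hypot(xdot(alpha + (i + 0.5) * h), ydot(alpha + (i + 0.5) * h))
        for i in range(n)
    )


# One arch of the cycloid x = t - sin t, y = 1 - cos t, 0 <= t <= 2*pi.
def cyc_x(t):
    return t - math.sin(t)


def cyc_y(t):
    return 1.0 - math.cos(t)


def cyc_xdot(t):
    return 1.0 - math.cos(t)


def cyc_ydot(t):
    return math.sin(t)


for n in (10, 100, 1000):
    print(n,
          polygon_length(cyc_x, cyc_y, 0.0, 2 * math.pi, n),
          integral_length(cyc_xdot, cyc_ydot, 0.0, 2 * math.pi, n))
```

Both columns approach 8 as n grows, and the polygon perimeters approach it from below.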


Alternative Definition of Length 


We add an interesting observation: The perimeter $S$ of any inscribed
polygon $\Pi$ can never exceed the length $L$ of the curve. (In particular,
the distance of the end points of the curve cannot exceed $L$; for the
straight line joining the end points is the shortest curve joining those
points.) Indeed, we may obtain $L$ as limit of the perimeters of a special
sequence of inscribed polygons, in which we start with the polygon $\Pi$ of
perimeter $S$ and obtain the following ones by adding successively more
and more vertices. Inserting an additional vertex between two successive
vertices of an inscribed polygon can never lead to a decrease in
perimeter, because one side of a triangle can never exceed the sum of the
other two. Thus $L$ is the limit of a nondecreasing sequence of perimeters
that starts with $S$. Hence $S \le L$. Therefore, instead of defining $L$
as limit of the perimeters of a sequence of inscribed polygons
corresponding to finer and finer subdivisions of the $t$-interval, we
could also have defined $L$ as the least upper bound of the perimeters of
all inscribed polygons. It is interesting that the length can be defined
without formally invoking any passage to the limit.
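The monotone behavior under refinement can be seen in a simple case. The following sketch, offered as an illustration only, inscribes polygons in the upper unit semicircle (length pi) and doubles the number of sides each time, so that every polygon keeps all vertices of the previous one; the function name is an assumption.

```python
import math


def semicircle_perimeter(n):
    """Perimeter of the polygon inscribed in x = cos t, y = sin t,
    0 <= t <= pi, with n equal parameter steps."""
    pts = [(math.cos(math.pi * i / n), math.sin(math.pi * i / n))
           for i in range(n + 1)]
    return sum(math.dist(pts[i], pts[i + 1]) for i in range(n))


# Doubling n keeps every old vertex, so each polygon refines the previous one.
perimeters = [semicircle_perimeter(2 ** k) for k in range(8)]
for k, s in enumerate(perimeters):
    print(2 ** k, s)
```

The printed perimeters increase with each refinement and stay below pi, which is their least upper bound.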


Invariance of Length under Parameter Changes 


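From its definition it is clear that the length $L$ of a curve $C$ cannot
depend on the particular parametric representation we use for $C$.
Hence, if we introduce a new parameter $\tau = \chi(t)$, where
$d\tau/dt > 0$, our integral formula for $L$ must give the same value
whether $t$ or $\tau$ is used as parameter. This can be verified
immediately from the chain rule of differentiation and the substitution
law for integrals. We have indeed

\[ \sqrt{\dot{x}^2 + \dot{y}^2} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\;\frac{d\tau}{dt}; \]

hence, if $\chi(\alpha) = a$, $\chi(\beta) = b$,

\[ \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt = \int_\alpha^\beta \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\;\frac{d\tau}{dt}\; dt = \int_a^b \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\; d\tau, \]

so that the expression for length based on the parameter $\tau$ leads to
the same value $L$. If, instead, $d\tau/dt < 0$ we find similarly

\[ \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt = -\int_a^b \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\; d\tau = \int_b^a \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\; d\tau; \]

the right-hand side is again the correct integral for the length of $C$
referred to the parameter $\tau$, since now $b < a$ because $\chi(t)$ is a
decreasing function.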

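For a curve given nonparametrically by a function $y = f(x)$,
$a \le x \le b$, we can introduce $x$ as parameter $t$. Then
$\dot{x} = 1$, $\dot{y} = dy/dx$. The length of the curve is then given by

\[ L = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\; dx. \tag{9} \]

Examples. As an example we find for the length of a segment of the
parabola $y = \frac{1}{2}x^2$ corresponding to the interval
$a \le x \le b$:

\[ L = \int_a^b \sqrt{1 + x^2}\; dx. \]

Here the substitution $x = \sinh t$ (see Chapter 3, p. 273) leads to

\[ L = \int_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b} \cosh^2 t\; dt = \frac{1}{2}\int_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b} (1 + \cosh 2t)\; dt = \frac{1}{2}\,(t + \sinh t \cosh t)\Big|_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b} \]

\[ = \frac{1}{2}\left(\operatorname{ar\,sinh} b + b\sqrt{1 + b^2} - \operatorname{ar\,sinh} a - a\sqrt{1 + a^2}\right). \]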
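The closed form just obtained can be compared with a direct numerical evaluation of formula (9). The following is a minimal sketch; the function names and the midpoint rule are illustrative choices, not part of the text.

```python
import math


def parabola_length_closed(a, b):
    """Length of y = x**2 / 2 over [a, b] from the closed form above."""
    def antiderivative(x):
        return 0.5 * (math.asinh(x) + x * math.sqrt(1.0 + x * x))
    return antiderivative(b) - antiderivative(a)


def parabola_length_numeric(a, b, n=10000):
    """Midpoint-rule approximation of the integral of sqrt(1 + x**2)."""
    h = (b - a) / n
    return h * sum(math.sqrt(1.0 + (a + (i + 0.5) * h) ** 2) for i in range(n))


print(parabola_length_closed(0.0, 1.0))
print(parabola_length_numeric(0.0, 1.0))
```

The two values agree to many decimal places, which is a useful check on the signs in the hyperbolic substitution.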
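For a curve given by an equation $r = r(\theta)$,
$\alpha \le \theta \le \beta$, in polar coordinates, we have the
representation $x = r(\theta)\cos\theta$, $y = r(\theta)\sin\theta$.
Choosing $\theta$ as parameter, we have

\[ \dot{x} = \dot{r}\cos\theta - r\sin\theta, \qquad \dot{y} = \dot{r}\sin\theta + r\cos\theta, \qquad \dot{x}^2 + \dot{y}^2 = r^2 + \dot{r}^2. \]

This leads to the expression

\[ L = \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\; d\theta \tag{10} \]

for the length of a curve in polar coordinates. We have, for example,
for the circle of radius $a$ about the origin, the equation
$r = \text{constant} = a$, $0 \le \theta \le 2\pi$. This gives for the
total length of the circle

\[ L = \int_0^{2\pi} a\; d\theta = 2\pi a. \]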
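Formula (10) lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: it uses a midpoint rule, the helper name `polar_length` is hypothetical, and the cardioid r = 1 + cos(theta), whose length is known to be 8, is an added test case that does not appear in the text.

```python
import math


def polar_length(r, rdot, alpha, beta, n=2000):
    """Midpoint-rule approximation of formula (10)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h
        total += math.hypot(r(theta), rdot(theta))  # sqrt(r**2 + r'**2)
    return total * h


# Circle of radius 3: r is constant, so dr/dtheta = 0.
circle = polar_length(lambda th: 3.0, lambda th: 0.0, 0.0, 2 * math.pi)

# Cardioid r = 1 + cos(theta), dr/dtheta = -sin(theta).
cardioid = polar_length(lambda th: 1.0 + math.cos(th),
                        lambda th: -math.sin(th), 0.0, 2 * math.pi)

print(circle, 2 * math.pi * 3.0)
print(cardioid)
```

The circle reproduces the value 2*pi*a of the text, and the cardioid value comes out close to 8.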


Additivity of Length


Let $C$ be a curve given by $x = x(t)$, $y = y(t)$,
$\alpha \le t \le \beta$, where $\dot{x}$ and $\dot{y}$ are continuous.
Let $\gamma$ be any intermediate value between $\alpha$ and $\beta$. From
the general rules for integrals we have

\[ \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt = \int_\alpha^\gamma \sqrt{\dot{x}^2 + \dot{y}^2}\; dt + \int_\gamma^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\; dt. \]

The integrals on the right represent, respectively, the lengths of the
portions into which $C$ is divided by the point corresponding to
$t = \gamma$. Hence the length of the whole curve equals the sum of the
lengths of its parts.

It is not necessary that $\dot{x}$ and $\dot{y}$ be continuous. The
integrals exist just as well when $\dot{x}$ and $\dot{y}$ have a finite
number of jump discontinuities, as would occur in a curve with corners.
The total length of the curve is then the sum of the lengths of the
smooth portions between the corners. Even more singular behavior of
$\dot{x}$ and $\dot{y}$ is permitted as long as the expression for the
length is meaningful as an improper integral.


g. The Arc Length as a Parameter 


We have seen that one and the same curve permits many different
parameter representations $x = x(t)$, $y = y(t)$. Any monotone function
of $t$ can be used as parameter instead of $t$. For many purposes,
however, it is of advantage to refer curves $C$ to some "standard
parameter" which in some way is distinguished geometrically. The abscissa
$x$ or the polar angle $\theta$ are not suitable for that purpose if
curves are to be described in the large; moreover, they depend on the
choice of coordinate system. The possibility of measuring lengths along a
curve provides us with a natural geometrically defined parameter to which
points $P$ of a rectifiable curve can be referred, namely, the length of
the portion of the curve between $P$ and some fixed point $P_0$.

We start out with an arbitrary parameter representation $x = x(t)$,
$y = y(t)$, $\alpha \le t \le \beta$ of $C$. Differentiation with respect
to $t$ is indicated by a dot. We introduce the "arc length" $s$ by the
indefinite integral

\[ s = \int \sqrt{\dot{x}^2 + \dot{y}^2}\; dt \tag{11} \]

or more precisely $s$ as a function of $t$ by

\[ s = s(t) = c + \int_{t_0}^{t} \sqrt{\dot{x}^2(\tau) + \dot{y}^2(\tau)}\; d\tau, \tag{11a} \]

where $c$ is a constant, $t_0$ a value between $\alpha$ and $\beta$, and
where we have written $\tau$ for the variable of integration to
distinguish it from the upper limit $t$. Clearly, for any values $t_1$
and $t_2$ in the parameter interval the difference

\[ s(t_2) - s(t_1) = \int_{t_1}^{t_2} \sqrt{\dot{x}^2 + \dot{y}^2}\; d\tau \tag{12} \]

is equal to the length of the portion of the curve bounded by the points
corresponding to $t = t_1$ and $t = t_2$, provided $t_1 < t_2$. For
$t_1 > t_2$ the difference $s(t_2) - s(t_1)$ is the negative of the
length of that portion. Thus the knowledge of any indefinite integral $s$
permits us to calculate the length of any part of the curve.


The Sign of Arc Length 


If the constant $c$ has the value 0 we can interpret $s(t)$ itself as the
length of the arc of the curve (or the "distance along the curve")
between the point $P_0$ with parameter $t_0$ and the point $P$ with
parameter $t$; here the length is counted positive in the case where the
arc with initial point $P_0$ and end point $P$ has the orientation
corresponding to increasing $t$.¹

The integral form of the definition of $s$ is equivalent to the relation

\[ \frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}. \tag{12a} \]

Using the symbolic notation for differentials (p. 180) $ds = (ds/dt)\,dt$,
etc., we can write this relation in the suggestive form

\[ ds = \sqrt{dx^2 + dy^2} \]

for the "element of length" $ds$.


Speed of Motion along a Curve


If $t$ is interpreted as the time and $x(t)$, $y(t)$ as coordinates of the
position of a moving point at the time $t$, we have in

\[ \dot{s} = \frac{ds}{dt} = \lim_{h \to 0} \frac{s(t + h) - s(t)}{h} \]

the rate of change of the distance moved by the point along its path with
respect to the time, that is, the speed of the particle. For a particle

1 Notice that the variable $s$ is not completely unique; it depends on the choice of
$P_0$ and $c$ and also on the orientation of the curve induced by the parameter $t$.
However, any other arc length is expressible in terms of $s$ in the form
($s$ + constant) or ($-s$ + constant).

moving with uniform speed along the curve, $\dot{s}$ is a constant and
$s$ is a linear function of the time $t$.

If our usual assumption

\[ \dot{x}^2 + \dot{y}^2 \ne 0 \]

is satisfied, we have $ds/dt \ne 0$ and can introduce $s$ itself as
parameter. Many formulas and calculations then simplify. The quantities

\[ \frac{dx}{ds} = \frac{dx}{dt}\,\frac{dt}{ds} = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \frac{dy}{ds} = \frac{dy}{dt}\,\frac{dt}{ds} = \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \]

are then just the direction cosines of the tangent pointing in the
direction of increasing $s$ (see (7), p. 346). The relation

\[ \left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2 = 1 \tag{13} \]

characterizes the parameter $s$ as the arc length along the curve.
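As a brief worked illustration, which is not in the original text, consider the circle of radius $a$ with the assumed choices $t_0 = 0$ and $c = 0$. The arc length is proportional to $t$, and after introducing $s$ as parameter the relation (13) holds identically:

```latex
\begin{aligned}
x &= a\cos t, \quad y = a\sin t, \qquad \dot{x}^2 + \dot{y}^2 = a^2, \\
s(t) &= \int_0^t a \, d\tau = a t, \\
x &= a\cos\frac{s}{a}, \quad y = a\sin\frac{s}{a}, \\
\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2
  &= \sin^2\frac{s}{a} + \cos^2\frac{s}{a} = 1 .
\end{aligned}
```

Here $\dot{s} = a$ is constant, so this is also an instance of motion with uniform speed along the curve.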


h. Curvature 
Definition by Rate of Change of Direction 


We discuss next a basic concept which refers only to the local 
behavior of a curve in the neighborhood of a point, the concept of 
curvature. 

As we describe the curve, the angle $\alpha$ of inclination of the curve
will vary at a definite rate per unit arc length traversed; this rate of
change of $\alpha$ we call the curvature of the curve. Accordingly the
curvature is defined as

\[ \kappa = \frac{d\alpha}{ds}. \tag{14} \]


Parametric Expressions. Let the curve be given parametrically by
functions $x = x(t)$, $y = y(t)$ having continuous first and second
derivatives with respect to $t$, for which
$\dot{x}^2 + \dot{y}^2 \ne 0$. In calculating the rate of change of the
direction angle $\alpha$ at the point $P$ we have to take into account
that $\alpha$ is not defined uniquely. However, the trigonometric
function of $\alpha$, $\tan\alpha = \dot{y}/\dot{x}$ (or
$\cot\alpha = \dot{x}/\dot{y}$ for $\dot{y} \ne 0$), has a definite value.
In forming $d\alpha/ds$ we can always assume that the parameter values
belonging to points in a neighborhood of $P$ all lie in an interval
throughout which one of the quantities $\dot{x}$, $\dot{y}$ stays
different from zero. If, say, $\dot{x} \ne 0$ we can assign to $\alpha$ a
value that varies continuously with $t$ throughout the interval by taking

\[ \alpha = \alpha(t) = \operatorname{arc\,tan}\frac{\dot{y}}{\dot{x}} + n\pi, \]

where $n$ is a fixed, possibly negative integer, and "arc tan" stands for
the principal value of the function (cf. p. 214), lying between $-\pi/2$
and $\pi/2$. Similarly, if $\dot{y} \ne 0$ in the interval we can take for
$\alpha$ the expression¹

\[ \alpha(t) = \operatorname{arc\,cot}\frac{\dot{x}}{\dot{y}} + n\pi = \frac{\pi}{2} - \operatorname{arc\,tan}\frac{\dot{x}}{\dot{y}} + n\pi. \]

In either case we find by direct differentiation for any parameter
representation

\[ \dot{\alpha} = \frac{d\alpha}{dt} = \frac{\dot{x}\ddot{y} - \ddot{x}\dot{y}}{\dot{x}^2 + \dot{y}^2}. \]

Since (see (12a), p. 353) also

\[ \dot{s} = \frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2}, \]

we obtain for the curvature $d\alpha/ds = \dot{\alpha}/\dot{s}$ of the
curve the expression

\[ \kappa = \frac{d\alpha}{ds} = \frac{\dot{\alpha}}{\dot{s}} = \frac{\dot{x}\ddot{y} - \ddot{x}\dot{y}}{(\dot{x}^2 + \dot{y}^2)^{3/2}}. \tag{15} \]

Choosing, in particular, the arc length $s$ as the parameter $t$ we have

\[ \dot{x}^2 + \dot{y}^2 = 1 \]

[see Eq. (13), p. 354] and hence we obtain the simplified result

\[ \kappa = \dot{x}\ddot{y} - \dot{y}\ddot{x}. \tag{15a} \]

Sign and Absolute Value of Curvature 


Introducing a new parameter $\tau = \tau(t)$ instead of $t$ does not
affect the direction of the tangent, and hence does not affect changes in
$\alpha$. Similarly, the absolute value of the difference of the
$s$-values in two points has a geometric meaning independent of the
choice of parameter, namely that of distance measured along the curve.
However, the sign of the difference must always be taken as the same as
the sign of the difference of the corresponding parameter values, since
we defined $s$ as


1 We could define $\alpha(t)$ as a continuous function for all parameter values $t$ by
dissecting the whole parameter interval into subintervals in each of which either
$\dot{x} \ne 0$ or $\dot{y} \ne 0$. In each of the subintervals we can define then
$\alpha(t)$ by one of the above expressions, choosing for each interval the constant
integer $n$ in such a way that the values of $\alpha$ in the common end point of two
adjacent intervals, as determined from the expressions for those intervals, coincide.

Figure 4.21 Rectification of curves.


Figure 4.21(a) Curvature $\kappa = \lim \Delta\alpha/\Delta s$ of a curve. (In the case
illustrated we have $\kappa < 0$.)

an increasing function of $t$. Thus the absolute value of the curvature
$|\kappa| = |d\alpha/ds|$ does not depend on the choice of parameter,
whereas the sign of $\kappa$ depends on the sense on the curve
corresponding to increasing $t$. Obviously, $\kappa > 0$ means that
$\alpha$ increases with $s$, that is, that the tangent turns
counterclockwise as we proceed along the curve with increasing $s$ or $t$
(see Fig. 4.21a). In this case the orientation of the curve $C$ is such
that the positive side of $C$ also is the "inner" side of $C$, that is,
the side toward which $C$ curves.


Figure 4.22 Graph of a convex function $f(x)$ (left) and concave function (right).


If the curve is given by an equation $y = f(x)$, we have, using $x$ as
parameter,

\[ \kappa = \frac{y''}{(1 + y'^2)^{3/2}}, \tag{16} \]

where $y'$ and $y''$ are the derivatives of $y$ with respect to the
variable $x$. Here the sign of the curvature is that corresponding to
increasing $x$. Obviously, $\kappa$ is positive for $y'' > 0$; in this
case the tangent turns counterclockwise as $x$ increases; we call the
function $f(x)$ convex. The portion of the curve joining any two points
lies below the straight line joining them. For $y'' < 0$ the tangent turns
clockwise for increasing $x$, and the function $f$ is called concave
(Fig. 4.22). Here the curve lies above the chord joining two of its
points. The intermediate case where the curvature has the value zero
corresponds (generally speaking) to a point of inflection at which
$y'' = 0$ (see p. 237).


Examples. For the curvature of the circle of radius $a$ given by
$x = a\cos t$, $y = a\sin t$ we find the constant value $1/a$ from the
general formula (15). Thus the curvature of a circle described in the
counterclockwise sense is the reciprocal of the radius. This result
assures us that our definition of curvature is really a suitable one; for
in a circle we naturally think of the reciprocal of the radius as a
measure of its curvature.

A second example is the curve defined by the function $y = x^3$. The
curvature is

\[ \kappa = \frac{6x}{(1 + 9x^4)^{3/2}}. \]

For $x < 0$ the function $y = x^3$ is concave, since $\kappa < 0$, and the
tangent is turning in a clockwise sense, whereas at $x = 0$ we have a
point of inflection, and for $x > 0$ the function becomes convex.

A curve whose curvature is identically equal to zero is a straight line,
as is easily seen from our definition, and the straight line is the only
such curve.
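Both examples can be reproduced from the general formula (15). The sketch below is illustrative; the function names are assumptions, and the clockwise circle x = a cos t, y = -a sin t is added only to show the change of sign.

```python
import math


def curvature(xd, yd, xdd, ydd):
    """Formula (15): signed curvature from first and second derivatives."""
    speed2 = xd * xd + yd * yd
    if speed2 == 0.0:
        raise ValueError("critical point: both first derivatives vanish")
    return (xd * ydd - xdd * yd) / speed2 ** 1.5


def circle_curvature(a, t, clockwise=False):
    """Curvature of x = a cos t, y = a sin t (or y = -a sin t if clockwise)."""
    sign = -1.0 if clockwise else 1.0
    return curvature(-a * math.sin(t), sign * a * math.cos(t),
                     -a * math.cos(t), -sign * a * math.sin(t))


def cubic_curvature(x):
    """Curvature of y = x**3 with x as parameter."""
    return curvature(1.0, 3.0 * x * x, 0.0, 6.0 * x)


print(circle_curvature(2.0, 0.7), circle_curvature(2.0, 0.7, clockwise=True))
print(cubic_curvature(-1.0), cubic_curvature(0.0), cubic_curvature(1.0))
```

The circle gives 1/a in the counterclockwise sense and -1/a in the clockwise sense, and the cubic changes from negative to positive curvature at the point of inflection.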


Circle of Curvature and Center of Curvature 


We introduce $\rho = 1/\kappa$. The quantity $|\rho| = 1/|\kappa|$ is
called the radius of curvature at the point in question. (It is infinite
at a point of inflection where $\kappa = 0$.) For a circle the radius of
curvature at any point is just the radius of the circle.

To any point $P = (x, y)$ of the curve $C$ we assign a circle tangent to
$C$ at $P$ and having the same curvature as $C$ when we traverse the
curve and the circle in the same sense at $P$. This circle is called the
circle of curvature of the curve $C$ at the point $P$. Its center is the
center of curvature of the curve $C$ corresponding to the point $P$
(Fig. 4.23). Since $C$ and the circle have the same radius of curvature,
the radius of the circle must be the radius of curvature $|\rho|$ of $C$,
and the center $(\xi, \eta)$ of the circle must lie on the normal of $C$
at $P$, at a distance $|\rho|$ away from $P$. Since $C$ and the circle
curve toward the same side, the center lies along the normal direction to
the curve at $P$, on the positive or negative side according as the
curvature $\kappa$ is positive or negative.

The direction from $P$ to the center of curvature forms an angle
$\alpha + \pi/2$ with the positive $x$-axis, if $\kappa > 0$. Thus, if
$\xi$, $\eta$ are the coordinates of the center of curvature and $x$, $y$
those of $P$, we have [see Equation (7), p. 346]

\[ \frac{\xi - x}{\rho} = \cos\left(\alpha + \frac{\pi}{2}\right) = -\sin\alpha = -\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \]

\[ \frac{\eta - y}{\rho} = \sin\left(\alpha + \frac{\pi}{2}\right) = \cos\alpha = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \]

Figure 4.23 Circle of curvature $\Gamma$ and center of curvature $(\xi, \eta)$
corresponding to point $P$ of curve $C$.

Hence for $\kappa > 0$,

\[ \xi = x - \frac{\rho\,\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \eta = y + \frac{\rho\,\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \tag{17} \]

If arc length $s$ is used as parameter $t$, we obtain the simple
expressions

\[ \xi = x - \rho\,\dot{y}, \qquad \eta = y + \rho\,\dot{x}. \tag{17a} \]

The same formulas for $\xi$, $\eta$ are obtained for $\kappa < 0$, in
which case the radius of curvature is $-\rho$ and moreover the direction
from $P$ to the center forms an angle $\alpha - \pi/2$ with the positive
$x$-axis.
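Formulas (17) can be tried on curves whose center of curvature is known in advance. The following sketch is an illustration only; the function name is hypothetical, and the parabola y = x**2 at its vertex and a clockwise circle of radius 2 are assumed test cases.

```python
import math


def center_of_curvature(x, y, xd, yd, xdd, ydd):
    """Center (xi, eta) from formulas (17), with rho = 1/kappa from (15)."""
    speed2 = xd * xd + yd * yd
    cross = xd * ydd - xdd * yd
    if speed2 == 0.0 or cross == 0.0:
        raise ValueError("no finite center of curvature at this point")
    kappa = cross / speed2 ** 1.5
    rho = 1.0 / kappa            # signed, negative when kappa < 0
    speed = math.sqrt(speed2)
    return x - rho * yd / speed, y + rho * xd / speed


# Parabola y = x**2 with x as parameter, at the vertex x = 0.
print(center_of_curvature(0.0, 0.0, 1.0, 0.0, 0.0, 2.0))

# Clockwise circle x = 2 cos t, y = -2 sin t at t = 0.3 (kappa < 0).
t = 0.3
print(center_of_curvature(2 * math.cos(t), -2 * math.sin(t),
                          -2 * math.sin(t), -2 * math.cos(t),
                          -2 * math.cos(t), 2 * math.sin(t)))
```

For the circle the computed center is the origin up to rounding, even though the curvature is negative, which matches the remark that the same formulas hold for both signs.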


Circle of Curvature as Osculating Circle 


Formulas (17) give an expression for the center of curvature in terms
of the parameter $t$ of the point $P$ on the curve. As $t$ ranges over all
values in the parameter interval the center of curvature describes a
curve, the so-called evolute of the given curve; since, with $x$ and $y$,
we have to regard $\dot{x}$, $\dot{y}$, and $\rho$ as known functions of
$t$, the foregoing formulas give parametric equations for this evolute.
Examples and a discussion of geometrical properties of the evolute will be
found in Appendix I, p. 424.

Any two curves are said to "osculate" at a point $P$ or to have "contact
of order two" at $P$, if they pass through $P$, have the same tangent at
$P$, and also the same curvature, when oriented the same way. Obviously,
two osculating curves have the same circle of curvature and center of
curvature at $P$. If the curves are given by equations $y = f(x)$ and
$y = g(x)$ in nonparametric form, it is easy to express the condition that
they have a point of contact $P$ and the same tangent and curvature at
$P$. If $x$ is the abscissa of the point of contact $P$, we have
$f(x) = g(x)$, $f'(x) = g'(x)$; the equality of curvature is expressed by

\[ \frac{f''(x)}{(1 + f'^2(x))^{3/2}} = \frac{g''(x)}{(1 + g'^2(x))^{3/2}}, \]

and hence also $f''(x) = g''(x)$. Thus the condition for a point of
contact with equal curvatures is that the values of $f$ and $g$ together
with those of their first and second derivatives agree at the point.

Consider a curve $C$: $y = f(x)$ and its circle of curvature $\Gamma$ at
$P$ represented by $y = g(x)$ in a neighborhood of $P$. Since the circle
$\Gamma$ coincides with its circle of curvature, we see that $C$ and
$\Gamma$ have the same circle of curvature, hence osculate at $P$.
Consequently, at the point of contact $f(x) = g(x)$, $f'(x) = g'(x)$,
$f''(x) = g''(x)$. We say this circle is the "best fitting" circle to the
curve at the point $P$ of contact, since no other circle meeting the curve
at the point of contact has "contact of order two" with $C$ at the point.
The circle of curvature is the osculating circle. (See also Chapter 6,
p. 459.)

Incidentally, just as the tangent to a curve is the limit for
$P_1 \to P$ of a line through two consecutive points $P$ and $P_1$ on $C$,
one can show that the circle of curvature at $P$ is the limit of the
circles through three points $P$, $P_1$, $P_2$ for $P_1 \to P$ and
$P_2 \to P$. The proof is left to the reader. (See Problem 4, p. 437.)


i. Change of Coordinate Axes. Invariance 


Properties inherent in a geometrical or physical situation do not 
depend on the specific coordinate system or “frame of reference” with 
respect to which they are formulated; the intrinsic character of prop- 
erties such as distance or length or angle must be reflected in state- 
ments showing that the respective formulas remain unchanged or are 
invariant if one passes from one coordinate system to another. A few 
brief remarks concerning this subject are appropriate in this section. 

We use the general equations connecting the coordinates x, y of a 
point P in one coordinate system with the coordinates $\xi, \eta$ of the same 
point P in any other system. The relative position of the second set of 
coordinate axes to the first set is characterized by the coordinates a, b 
that the origin of the second system has in the first system, and by the 
angle $\gamma$ which the positive $\xi$-axis makes with the positive x-axis. 
The coordinates (x, y) and $(\xi, \eta)$ of the same point in the two systems are 
(cf. Fig. 4.24) connected by the transformation 

$$x = \xi\cos\gamma - \eta\sin\gamma + a, \qquad y = \xi\sin\gamma + \eta\cos\gamma + b. \tag{18}$$

For $\gamma = 0$ no rotation of the axes but only a parallel displacement or 
translation is involved, and the formulas take the simple form 
$x = \xi + a$, $y = \eta + b$. 


Figure 4.24 Change of coordinate axes. 


Solving for $\xi, \eta$ in terms of x, y we find 

$$\xi = (x - a)\cos\gamma + (y - b)\sin\gamma, \qquad \eta = -(x - a)\sin\gamma + (y - b)\cos\gamma. \tag{18a}$$


If x and y are functions of a parameter t defining a curve, we obtain 
immediately from these formulas expressions for $\xi$ and $\eta$ as functions of 
t, giving the parameter representation of the same curve in the $\xi,\eta$-system. 
Differentiating with respect to t (the quantities $a, b, \gamma$ which 
fix the relative position of the two coordinate systems do not depend 
on t) yields the transformation of the “velocity components,” that is, 


1 We restrict ourselves to “right-handed” coordinate systems in which the positive 
direction of the second axis of a system is obtained by a counterclockwise 90° 
rotation from that of the first axis. 

for the derivatives of the coordinates with respect to t, 

$$\dot x = \dot\xi\cos\gamma - \dot\eta\sin\gamma, \qquad \dot y = \dot\xi\sin\gamma + \dot\eta\cos\gamma.$$ ¹

We confirm 

$$\dot x^2 + \dot y^2 = \dot\xi^2 + \dot\eta^2.$$

Thus the expression $\sqrt{\dot x^2 + \dot y^2}$ has the same value in all coordinate 
systems; this invariance property is, of course, obvious from the 
interpretation of this quantity as rate of change ds/dt of the length 
along the curve with respect to t. The reader may verify by an easy 
calculation that also the expression 
$\kappa = (\dot x\ddot y - \dot y\ddot x)/(\dot x^2 + \dot y^2)^{3/2}$ for 
the curvature is invariant. (This, of course, follows also directly from 
the fact that the angles the tangent makes respectively with the $\xi$- 
and x-axes differ only by the constant value $\gamma$, so that $\kappa = d\alpha/ds$ 
cannot change.) 


Figure 4.25 Displacement of point P from position $(x, y)$ to position $(\xi, \eta)$. 


Equations (18) relating the coordinates x, y to the coordinates $\xi, \eta$ 
are often interpreted in a different way as describing a displacement. In 
this interpretation the points P are shifted instead of the coordinate 
axes (Fig. 4.25). Only one coordinate system is used. The point with 


1 In some physical applications, where t stands for time, the relative position of the 
two coordinate systems also depends on time; let the quantities x, y stand for the 
coordinates of a particle in a coordinate system that is at rest, whereas $\xi, \eta$ are 
the coordinates of the same particle referred to a moving coordinate system, for example, 
axes that are attached to the moving earth. The functions x(t), y(t) describe the path 
of the particle as it looks to an observer at rest, whereas $\xi(t), \eta(t)$ describe the path 
as it looks to a moving observer. The formulas connecting $\dot x, \dot y$ with $\dot\xi, \dot\eta$ have to 
include then also the obvious terms arising from differentiation of a, b, and $\gamma$. 




coordinates (x, y) in that system is mapped onto the point with 
coordinates $(\xi, \eta)$ in the same system. Invariance of length or curvature 
of a curve now means that these quantities do not change when the 
whole curve undergoes a rigid motion. 
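
The invariance statements of this section can be checked numerically. The following Python sketch is only an illustration: the test curve (an ellipse), the values of a, b, $\gamma$, and the helper names `to_new_system` and `speed_and_curvature` are assumptions made here, not part of the text. It applies formula (18a) and estimates $\sqrt{\dot x^2 + \dot y^2}$ and $\kappa$ by central differences in both coordinate systems.

```python
import math

def to_new_system(x, y, a, b, gamma):
    """Formula (18a): coordinates (xi, eta) of the point (x, y) in the second system."""
    xi = (x - a) * math.cos(gamma) + (y - b) * math.sin(gamma)
    eta = -(x - a) * math.sin(gamma) + (y - b) * math.cos(gamma)
    return xi, eta

def speed_and_curvature(curve, t, h=1e-4):
    """Central-difference estimates of sqrt(x'^2 + y'^2) and of the curvature at t."""
    (x0, y0), (x1, y1), (x2, y2) = curve(t - h), curve(t), curve(t + h)
    dx, dy = (x2 - x0) / (2 * h), (y2 - y0) / (2 * h)
    ddx, ddy = (x2 - 2 * x1 + x0) / h**2, (y2 - 2 * y1 + y0) / h**2
    speed = math.hypot(dx, dy)
    return speed, (dx * ddy - dy * ddx) / speed**3

# Hypothetical test curve and frame parameters, chosen only for illustration.
def ellipse(t):
    return 3.0 * math.cos(t), 2.0 * math.sin(t)

a, b, gamma = 1.5, -0.7, 0.9

def ellipse_new(t):
    """The same curve referred to the second coordinate system."""
    return to_new_system(*ellipse(t), a, b, gamma)

s_old, k_old = speed_and_curvature(ellipse, 0.4)
s_new, k_new = speed_and_curvature(ellipse_new, 0.4)
print("speed:    ", s_old, s_new)
print("curvature:", k_old, k_new)
```

Up to rounding errors the two printed pairs should agree, in accordance with the invariance of ds/dt and of $\kappa$.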


* j. Uniform Motion in the Special Theory of Relativity 


As pointed out on p. 234 there are far-reaching analogies between the 
trigonometric and the hyperbolic functions which have their geometric 
counterpart in the correspondence between properties of ellipses and 
hyperbolas. The relationship will become clear when we shall be able to define the 
trigonometric functions for an imaginary argument and to verify that 
$\cos(it) = \cosh t$, $\sin(it) = i\sinh t$ in Section 7.7a. As an application of this 
analogy we consider the “hyperbolic rotations” of the plane which can be 
identified with the Lorentz-transformations of a line in Einstein’s special 
theory of relativity. 

We saw in (18a), p. 361, that a rotation of coordinate axes by an angle $\gamma$ 
which leaves the origin fixed can be described by the equations 

$$\xi = x\cos\gamma + y\sin\gamma, \qquad \eta = -x\sin\gamma + y\cos\gamma \tag{18b}$$

connecting the coordinates x, y of a point P in the first system with its 
coordinates $\xi, \eta$ in the second system. The distance of P from the origin is given 
by the same expression in both systems: 

$$OP = \sqrt{x^2 + y^2} = \sqrt{\xi^2 + \eta^2}.$$

This follows also immediately from the transformation equations if we make 
use of the identity $\cos^2\gamma + \sin^2\gamma = 1$. 

We now consider the analogous transformation with coefficients that are 
hyperbolic instead of trigonometric functions: 

$$\xi = x\cosh\alpha - t\sinh\alpha, \qquad \tau = -x\sinh\alpha + t\cosh\alpha; \tag{19}$$

these formulas can be obtained from the formulas (18b) for rotations by taking 
for the rotation angle $\gamma$ and the y- and $\eta$-coordinates pure imaginary quantities: 

$$\gamma = i\alpha, \qquad y = it, \qquad \eta = i\tau.$$


We notice that for a real value of $\alpha$ (which would mean an imaginary 
angle of rotation $\gamma$ in the original interpretation) formulas (19) define $\xi$ and 
$\tau$ as real linear functions of x and t. These functions have the special property 
that 

$$\xi^2 - \tau^2 = (x\cosh\alpha - t\sinh\alpha)^2 - (-x\sinh\alpha + t\cosh\alpha)^2 = x^2 - t^2$$

as a consequence of the identity $\cosh^2\alpha - \sinh^2\alpha = 1$. (This follows, of 
course, also from the observation that $x^2 - t^2 = x^2 + y^2$ is the square of the 
distance from the origin in the x,y-plane.) We now interpret t as the time and 
x as a space-coordinate describing the location of a point in a one-dimensional 
space, that is, on a straight line. Any event takes place at a certain 
point at a certain time. These two pieces of information are provided by the 
two numbers x, t giving respectively the (signed) distance x of the point from 
the origin O and the time t that has elapsed from the time 0. In the theory of 
relativity we take the point of view that the measured values of this distance 
and of elapsed time depend on the frame of reference used by the observer, 
that is, on the special coordinate system in the space-time continuum. The 
quantities $\xi, \tau$ obtained from the formulas (19) will describe the same event 
in a different frame of reference in which distances and lengths of time 
intervals can have different values. The quantity that is unchanged in the 
transition from one reference frame to another one (known as 
“Lorentz-transformation”) is 

$$\sqrt{x^2 - t^2} = \sqrt{\xi^2 - \tau^2},$$

the “space-time distance” of the event from the origin. For an observer 
using the second system the quantity $\xi$ is the space distance measured from 
the origin $\xi = 0$. That origin is a point for which 


$$x\cosh\alpha - t\sinh\alpha = 0$$

or $x/t = \tanh\alpha$. Thus the origin of the second system is a point which in the 
first system appears to move with uniform velocity $v = dx/dt = \tanh\alpha$ 
relative to the origin of the first system. Hence the Lorentz transformation 
relates the values of distances and times as they appear for observers in two 
systems moving with constant velocity v relative to each other. Here 

$$v = \frac{\sinh\alpha}{\cosh\alpha} = \frac{e^{\alpha} - e^{-\alpha}}{e^{\alpha} + e^{-\alpha}}$$

lies necessarily between −1 and +1 so that we are restricted to relative 
velocities of the two systems that lie numerically below the value 1. The value 
1 here represents for suitable choice of units the velocity c of light which 
cannot be exceeded by v. 

For a constant u the equation x = ut corresponds to a point which in the 
first system moves with the velocity u, starting at x = 0 at the time t = 0. In 
the second system the same point will have the velocity 

$$\omega = \frac{d\xi}{d\tau} = \left(\frac{d\xi}{dt}\right)\Big/\left(\frac{d\tau}{dt}\right) = \frac{u - \tanh\alpha}{1 - u\tanh\alpha} = \frac{u - v}{1 - uv}.$$

This result, valid in Einstein’s special theory of relativity, differs from what 
we would obtain in classical kinematics where the velocity $\omega$ of a point with 
respect to a system moving with velocity v would simply be given by 
$\omega = u - v$. The relativistic formula shows that $\omega = u$ when u = +1 or −1; 
this corresponds to the fact suggested by the famous Michelson-Morley 
experiment that the velocity of light is the same for observers moving with 
different velocities. 
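
As a numerical illustration of formulas (19) and of the velocity formula, the following Python sketch may be helpful. The sample values of $\alpha$, x, t, u and the function names `lorentz` and `transformed_velocity` are assumptions introduced here for the purpose of the example. The sketch checks that $x^2 - t^2$ is unchanged and that the velocity of the point x = ut, measured from two events in the second system, agrees with $(u - v)/(1 - uv)$.

```python
import math

def lorentz(x, t, alpha):
    """Formulas (19): hyperbolic rotation of the x,t-plane."""
    xi = x * math.cosh(alpha) - t * math.sinh(alpha)
    tau = -x * math.sinh(alpha) + t * math.cosh(alpha)
    return xi, tau

def transformed_velocity(u, v):
    """Velocity (u - v)/(1 - u v) of the point x = u t as seen in the second system."""
    return (u - v) / (1.0 - u * v)

alpha = 0.8                    # hypothetical value of the parameter alpha
v = math.tanh(alpha)           # relative velocity of the two systems

# An arbitrary event (x, t) and its description in the second system.
x, t = 2.0, 5.0
xi, tau = lorentz(x, t, alpha)
print("x^2 - t^2   =", x**2 - t**2)
print("xi^2 - tau^2 =", xi**2 - tau**2)

# Two events on the path x = u t, transformed to the second system.
u = 0.5
xi1, tau1 = lorentz(u * 1.0, 1.0, alpha)
xi2, tau2 = lorentz(u * 3.0, 3.0, alpha)
omega = (xi2 - xi1) / (tau2 - tau1)
print("omega from events:", omega)
print("omega from formula:", transformed_velocity(u, v))

# For u = +1 or -1 the velocity is the same in both systems.
print(transformed_velocity(1.0, v), transformed_velocity(-1.0, v))
```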
k. Integrals Expressing Area within Closed Curves 


In Chapter 2 the concept of integral was motivated by reference to 
“area under a curve,” that is, the area of a strip of special shape. This 
specialization to areas under a curve is not quite satisfactory since the 
areas actually encountered most frequently are those of domains 
inside closed curves C, and are of more general shape than the strips 
whose area can be represented by integrals of the form $\int_a^b f(x)\,dx$. 


The Basic Formula 


We shall now derive an elegant general integral representation for 
the area bounded by a closed curve C which is given in parametric 
representation, by breaking up the area into special strip areas. This 
representation will be independent of the parameter representation and 
likewise independent of the coordinate system. Furthermore, it will 
express the oriented area within the curve in accordance with the sense 
of direction assigned to the boundary C; that is, it will assign to an 
area within a simple closed curve C the negative or positive sign accord- 
ing as the sense of the boundary curve is clockwise or counter- 
clockwise. 

Assume that the simple closed oriented curve C is given by x = x(t), 
y = y(t), where t varies over the interval $\alpha \le t \le \beta$ and the sense 
of increasing t determines the sense on C. We assume that x and y 
are continuous functions of t (with the same value at $t = \alpha$ and $t = \beta$) 
and that their first derivatives $\dot x$ and $\dot y$ are continuous, with the possible 
exception of a finite number of jump-discontinuities if C has corners. 
Under these assumptions we shall prove the basic formula 

$$A = -\int_\alpha^\beta y\dot x\,dt = \int_\alpha^\beta x\dot y\,dt = \frac{1}{2}\int_\alpha^\beta (x\dot y - y\dot x)\,dt \tag{20}$$

for the oriented area A within C. 

That the three integral representations in the formula are equivalent 
follows directly if we integrate the first one by parts and use the 
periodicity conditions $x(\alpha) = x(\beta)$, $y(\alpha) = y(\beta)$; the third, more symmetric 
representation is just the arithmetic mean of the first two. 

The expressions (20) do not depend on the location of the coordinate 
system in the plane. In fact, the symmetric expression 

$$A = \frac{1}{2}\int_\alpha^\beta (x\dot y - y\dot x)\,dt$$

shows clearly that the value of A is independent of the choice of the 
coordinate system. As we saw on p. 361 a change of coordinates from 
an x,y-system to a $\xi,\eta$-system is achieved by a substitution of the form 

$$x = \xi\cos\gamma - \eta\sin\gamma + a, \qquad y = \xi\sin\gamma + \eta\cos\gamma + b,$$

with constant $a, b, \gamma$. Differentiation of these formulas with respect to t 
yields 

$$\dot x = \dot\xi\cos\gamma - \dot\eta\sin\gamma, \qquad \dot y = \dot\xi\sin\gamma + \dot\eta\cos\gamma$$

and consequently, 

$$x\dot y - y\dot x = \xi\dot\eta - \eta\dot\xi + a\dot y - b\dot x.$$

Thus the expression $x\dot y - y\dot x$ is invariant under rotations about the 
origin (that is, when a = b = 0). Even when a or b does not vanish, 
the value of the integral for A is not affected, since 

$$\int_\alpha^\beta (a\dot y - b\dot x)\,dt = (ay - bx)\Big|_\alpha^\beta = 0$$

for the closed curve C. 


Proof of the Basic Formula (20). Line Integrals over Simple Arcs. The 
basic formula (20) is proved in some easy steps. 

First, let C be a simple oriented arc with initial point $P_0$ and final 
point $P_1$. Let x = x(t), y = y(t) be any parameter representation of C 
with $P_0$, $P_1$ corresponding respectively to $t = t_0$, $t_1$. (Here $t_0$ may be 
larger or smaller than $t_1$.) Then the integral 

$$A = -\int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt$$

depends only on C and not on the particular parameter representation. 
This is an obvious consequence of the substitution rule; if we introduce 
a new parameter $\tau$ by the monotone function $\tau = \chi(t)$ where 
$\tau_0 = \chi(t_0)$, $\tau_1 = \chi(t_1)$, the corresponding integral is¹ 

$$-\int_{\tau_0}^{\tau_1} y\,\frac{dx}{d\tau}\,d\tau = -\int_{t_0}^{t_1} y\,\frac{dx}{d\tau}\,\frac{d\tau}{dt}\,dt = -\int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt = A.$$


1 We assume not only that x(t), y(t) but also $\tau(t)$ are continuous functions and that 
their derivatives are continuous with the possible exception of a finite number of 
jump-discontinuities. 

It is therefore justified to drop from the expression for the integral A 
the reference to any special parameter t and simply to write 

$$A_C = -\int_C y\,dx.$$

Here $A_C$ for a simple oriented arc C is to be computed by referring the 
arc to a parameter t, using dx = (dx/dt) dt, and taking as limits for the 
t-integration the parameter values for the end points of C in the order 
determined by the orientation of C.¹ 

If C′ is the arc obtained from C by changing its orientation, that is, 
the arc with initial point $P_1$ and final point $P_0$, we have, using the same 
parameter representation for C′, 

$$A_{C'} = -\int_{t_1}^{t_0} y\,\frac{dx}{dt}\,dt = \int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt = -A_C.$$

Hence changing the orientation of an arc C changes the sign of the 
integral $A_C$. 
If the oriented simple arc C is broken up into oriented subarcs 
$C_1, C_2, \ldots, C_n$, each with the same orientation as C, we obviously have 

$$A_C = A_{C_1} + A_{C_2} + \cdots + A_{C_n}.$$

For in a parameter representation of C where, say, the sense of C is 
that of increasing t, this decomposition corresponds to a subdivision 
of the parameter interval $t_0 \le t \le t_n$ for C into subintervals $t_0 \le t \le t_1$, 
$t_1 \le t \le t_2$, ..., $t_{n-1} \le t \le t_n$ corresponding to $C_1, \ldots, C_n$. The 
result then follows from the additivity of integrals. 

The additivity of the integrals $A_C$ makes it much easier to compute the 
value of $A_C$ in cases where C consists of several smooth arcs $C_1$, 
$C_2, \ldots$, each with its own parameter representation. We do not need 
to construct artificially a common parameter representation for the 
whole curve C, but instead compute each $A_{C_i}$ separately from its 
parameter representation and then take the sum. Moreover, the $A_{C_i}$ 
can be added in any order; we only have to make sure that all $C_i$ 
have the same orientation as C. 
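
The three properties just established (independence of the parameter, change of sign under reversal of orientation, and additivity) can be observed in a small numerical experiment. The Python sketch below is an illustration under assumptions of our own: the function name `arc_integral`, the approximating sum, and the sample region between the parabola $y = x^2$ and the line y = 1 are not taken from the text. Each boundary arc is given its own parameter representation, and the values of $-\int y\,dx$ are then added.

```python
def arc_integral(x, y, t0, t1, n=20000):
    """Approximate A_C = -integral of y dx over the arc t0 -> t1 by a sum
    of y(midpoint) * (increment of x); t0 may be larger than t1."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        ta = t0 + i * h
        tb = ta + h
        total -= y(0.5 * (ta + tb)) * (x(tb) - x(ta))
    return total

# Region between the parabola y = x^2 and the line y = 1, boundary described
# counterclockwise: first along the parabola (x increasing), then back along
# the line (x decreasing).
bottom = arc_integral(lambda t: t, lambda t: t * t, -1.0, 1.0)
top = arc_integral(lambda t: t, lambda t: 1.0, 1.0, -1.0)
area = bottom + top

# The same parabolic arc with a different parameter: x = t^3, y = t^6.
bottom_reparam = arc_integral(lambda t: t**3, lambda t: t**6, -1.0, 1.0)

# The parabolic arc with reversed orientation.
bottom_reversed = arc_integral(lambda t: t, lambda t: t * t, 1.0, -1.0)

print("A_bottom, A_top, sum:", bottom, top, area)
print("reparametrized:", bottom_reparam)
print("reversed:", bottom_reversed)
```

The sum should be close to 4/3, the area of the region, which can also be found directly as $\int_{-1}^{1} (1 - x^2)\,dx$.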


The Basic Line Integral for Closed Curves 


We can now define $A_C$ for any oriented, simple closed curve C by 
breaking up C into simple arcs $C_1, \ldots, C_n$ with orientations agreeing 
with that on C and forming the sum of the $A_{C_i}$.¹ If the whole closed 
curve C has the parameter representation x = x(t), y = y(t) for 
$\alpha \le t \le \beta$, where the sense of increasing t gives the orientation of C and 
where $t = \alpha$ and $t = \beta$ correspond to the same point, then $A_C$ is again 
given by 

$$A_C = -\int_\alpha^\beta y\,\frac{dx}{dt}\,dt.$$

In the same way we can define $A_C$ for nonsimple oriented curves C by 
decomposition into simple oriented arcs, even when C consists of several 
disjoint pieces, as long as each portion of C has a definite sense. 


1 The integral $\int_C y\,dx$ is an example of the general line integrals 
$\int_C p\,dx + q\,dy$ which will be discussed in Volume II. 


The Basic Integral as Area 


We now turn to the main point; that is, we identify the expression 
$A_C$ for a closed curve with the intuitive geometric quantity of oriented 
area within C. 


Figure 4.25(a) Area of a “‘cell.” 


We consider first a domain G bounded from above by an arc $C_1$: 
y = g(x) for $a \le x \le b$; below by an arc $C_3$: y = f(x) for $a \le x \le b$; 
and laterally by line segments $C_2$, $C_4$ given by x = a and x = b (Fig. 
4.25a). Here $C_2$ and $C_4$ are permitted to shrink into points. If we give to 


1 That the value of $A_C$ obtained in this fashion does not depend on the particular 
way in which we divide up C into simple arcs follows easily: first the additivity 
property of A for simple arcs shows that refining a given subdivision by introducing 
additional dividing points does not change the resulting value of $A_C$; moreover, any 
two subdivisions can be replaced by one that is a refinement of both without changing 
the value of $A_C$. 

C the counterclockwise orientation, the arc $C_1$ will be described in the 
sense of decreasing x, and the arc $C_3$ in that of increasing x. In forming 
$A_C$ as the sum of the four $A_{C_i}$ the portions $C_2$ and $C_4$ along which x is 
constant make no contribution since there dx/dt = 0. Using x as 
parameter on the arcs $C_1$ and $C_3$, we find 

$$A_C = A_{C_1} + A_{C_3} = -\int_b^a g(x)\,dx - \int_a^b f(x)\,dx = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx.$$


This clearly is the positive area of the domain G, if G lies completely 
above the x-axis, being the difference of the areas lying respectively 
below the curves $C_1$ and $C_3$. We can always guarantee that G lies 
above the axis by replacing y by y + c with a suitable constant c, 
that is, by a translation in the y-direction. This does not change areas 
and also does not affect the value of $A_C = -\int_C y\,dx$ for a closed curve 
C, as we saw before. Hence for domains G of the type described, which 
have a boundary C intersected in no more than two points by parallels 
to the y-axis, the integral $A_C$ represents the area, taken positive if C is 
oriented counterclockwise, negative if clockwise. We obtain the same 
result for areas bounded by a curve C intersected by parallels to the 
x-axis in at most two points; we have only to write $A_C$ in the form 
$\int_C x\,dy$ and to interchange x and y in the preceding argument. We call 
domains G of one of these two types “cells.” We shall talk of “oriented 
cells” when their boundary curves are given one or the other orientation. 

We now consider a domain G with the oriented boundary C, which 
is composed of a number of simple cells $G_1, G_2, \ldots, G_n$ with the 
boundaries $C_1, \ldots, C_n$, respectively; all these cells are assumed to 
have the same orientation, say counterclockwise. Then, as indicated 
in Fig. 4.26, the parts of the boundaries of the cells which are common 
to two adjacent cells are described in a different sense according 
as they are considered boundary arcs of one or the other of the 
adjacent cells. Therefore, if we add the integrals $A_{C_i} = -\int_{C_i} y\,dx$ for 
the different cells, the contributions of all the interior cell boundaries 
cancel out and we obtain 

$$A = \sum_{i=1}^{n} A_{C_i} = \sum_{i=1}^{n}\left(-\int_{C_i} y\,dx\right) = -\int_C y\,dx = A_C$$

where A is the oriented area of the total domain G. 

Thus the formula (20) for the area A of an oriented domain G within 
a closed curve is proved for all domains which can be decomposed 
into simple cells, for example, by drawing parallels to the coordinate 
axes. 

For all domains that we shall encounter, this assumption will be 
obviously satisfied, as, for example, for polygonal domains. 


Figure 4.26 Decomposition of oriented domain into oriented cells. 


Supplementary Remarks 


Finally, it might be added that the validity of the formula for area 
follows in the same way, even for multiply connected domains, such as 
ring-shaped domains, which can be decomposed into a finite number of 
simple cells. Then all the boundary curves have to be described con- 
sistently in such a sense that the interior of G is always either on the 
“left” side or always on the “right”’ side. 

The formulas for A remain meaningful even when C is not a simple 
curve but is allowed to intersect itself, dividing the plane into more than 
two regions. In this case we may consider the formula as a guide to 
interpreting area suitably as an additive combination of the oriented 
areas of the various connected pieces of the plane bounded by C. 
We shall discuss this matter in Appendix II to this chapter. 


Examples. As an example we can find the area enclosed by the 
ellipse $x^2/a^2 + y^2/b^2 = 1$. Using the counterclockwise orientation for 
the ellipse, we find from the parameter representation 

$$x = a\cos t, \qquad y = b\sin t \qquad \text{for } 0 \le t \le 2\pi$$

that 

$$A = \frac{1}{2}\int_0^{2\pi} (x\dot y - y\dot x)\,dt = \frac{1}{2}\int_0^{2\pi} ab\,dt = \pi ab.$$
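
The three equivalent expressions in formula (20) can be compared numerically for this ellipse. The following Python sketch is an illustration only; the function name `area_integrals`, the approximating sums, and the sample values a = 3, b = 2 are assumptions made for the example. It also evaluates the integrals for a translated ellipse and for the clockwise orientation.

```python
import math

def area_integrals(x, y, t0, t1, n=20000):
    """Approximate the three expressions in formula (20) for a closed curve."""
    h = (t1 - t0) / n
    minus_y_dx = 0.0
    x_dy = 0.0
    for i in range(n):
        ta = t0 + i * h
        tb = ta + h
        tm = 0.5 * (ta + tb)
        minus_y_dx -= y(tm) * (x(tb) - x(ta))
        x_dy += x(tm) * (y(tb) - y(ta))
    return minus_y_dx, x_dy, 0.5 * (minus_y_dx + x_dy)

a, b = 3.0, 2.0   # hypothetical semiaxes

def ex(t):
    return a * math.cos(t)

def ey(t):
    return b * math.sin(t)

forms = area_integrals(ex, ey, 0.0, 2.0 * math.pi)
print("three forms of (20):", forms)
print("pi * a * b         :", math.pi * a * b)

# Translating the curve does not change the value of the integrals.
shifted = area_integrals(lambda t: ex(t) + 5.0, lambda t: ey(t) - 4.0,
                         0.0, 2.0 * math.pi)
print("translated ellipse :", shifted)

# Describing the ellipse clockwise changes the sign.
clockwise = area_integrals(ex, ey, 2.0 * math.pi, 0.0)
print("clockwise          :", clockwise)
```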


Area in Polar Coordinates. To express area in polar coordinates 
r and $\theta$, we consider first the area A of the region bounded by a curve 
segment $r = f(\theta)$ and the radii $\theta = \alpha$ and $\theta = \beta$. We assume that 
$\alpha < \beta$ and that $\theta$ can be used as parameter along the curve (that is, 
that different points have different polar angles). We use for A the 
expression 

$$A = \frac{1}{2}\int (x\,dy - y\,dx) = \frac{1}{2}\int (x\dot y - y\dot x)\,dt,$$

which then has to be extended over the curved part of the boundary 
and over the two radii. On the radii $\theta = \alpha$ and $\theta = \beta$ we can use r as 
parameter and find from $x = r\cos\theta$, $y = r\sin\theta$, and $\theta$ = constant 
that $\dot x = \cos\theta$, $\dot y = \sin\theta$, and thus $x\dot y - y\dot x = 0$. On the curved 
part we use $\theta$ as parameter. Then 

$$\dot x = \frac{dr}{d\theta}\cos\theta - r\sin\theta, \qquad \dot y = \frac{dr}{d\theta}\sin\theta + r\cos\theta,$$

and thus $x\dot y - y\dot x = r^2$. Consequently, 

$$A = \frac{1}{2}\int_\alpha^\beta r^2\,d\theta = \frac{1}{2}\int_\alpha^\beta f^2(\theta)\,d\theta. \tag{21}$$

For a simple closed curve C which contains the origin in its interior and 
is intersected by every ray from the origin in exactly one point we can 
use $\theta$ as parameter for $0 \le \theta \le 2\pi$ and find for the enclosed area 

$$A = \frac{1}{2}\int_0^{2\pi} r^2\,d\theta.$$

Formula (21) for area in polar coordinates can also be derived directly 
from the definition of integrals. For that purpose we divide our domain 
into sectors by drawing radii from the origin (Fig. 4.27). Each sector 
is described by inequalities 

$$\theta_{i-1} \le \theta \le \theta_i, \qquad 0 \le r \le f(\theta).$$

Obviously, the area of the sector lies between the areas of the inscribed 
and circumscribed circular sectors; the area of a sector of the domain 
is then equal to $\frac{1}{2}r_i^2(\theta_i - \theta_{i-1})$, where $r_i$ lies between the largest and 
smallest values of $f(\theta)$ for the interval $\theta_{i-1} \le \theta \le \theta_i$. As we refine the 
subdivision, the sum of the areas of the sectors of our domain clearly 
converges to the integral $\frac{1}{2}\int_\alpha^\beta r^2\,d\theta$. 


Figure 4.27 Area in polar coordinates. 


Area in a Lemniscate 


As an example of Equation (21) we consider the area bounded by a 
loop of a lemniscate. The equation of the lemniscate (cf. p. 103) is 
$r^2 = 2a^2\cos 2\theta$; one loop is obtained by having $\theta$ vary from $-\pi/4$ 
to $+\pi/4$. This gives us the expression 

$$a^2\int_{-\pi/4}^{\pi/4} \cos 2\theta\,d\theta = a^2$$


for the area. Of course, the other loop has the same absolute but 
negative value of area. 
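
The derivation of formula (21) from sums of sector areas suggests a direct numerical check. The Python sketch below is a minimal illustration; the function name `polar_area`, the choice of the midpoint value of $f(\theta)$ in each sector, and the sample value a = 1.5 are assumptions of this example. It forms the sum of the sector areas $\frac{1}{2}r_i^2(\theta_i - \theta_{i-1})$ for one loop of the lemniscate and, for comparison, for a circle.

```python
import math

def polar_area(f, alpha, beta, n=20000):
    """Sum of sector areas (1/2) r_i^2 (theta_i - theta_{i-1}),
    with r_i taken as f at the midpoint of each angular interval."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        r = f(alpha + (i + 0.5) * h)
        total += 0.5 * r * r * h
    return total

a = 1.5   # hypothetical value of the constant a in r^2 = 2 a^2 cos(2 theta)

def lemniscate(theta):
    # max(...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, 2.0 * a * a * math.cos(2.0 * theta)))

loop_area = polar_area(lemniscate, -math.pi / 4, math.pi / 4)
print("area of one loop:", loop_area, " a^2 =", a * a)

# A circle of radius 2 about the origin, 0 <= theta <= 2 pi.
circle_area = polar_area(lambda theta: 2.0, 0.0, 2.0 * math.pi)
print("circle:", circle_area, " pi r^2 =", math.pi * 4.0)
```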


Area Bounded by a Hyperbola 


We now consider the area of a sector bounded by the hyperbola 
$x^2 - y^2 = 1$, which we computed already on p. 234 in a rather 
cumbersome fashion (see Fig. 3.12). For the hyperbola (or rather for its 
right-hand branch) we have the parameter representation $x = \cosh t$, 
$y = \sinh t$. We find indeed for twice the area bounded by the hyperbola 
and the radii leading to the points with parameters 0 and t the value 

$$2A = \int_0^t (x\dot y - y\dot x)\,d\tau = \int_0^t (\cosh^2\tau - \sinh^2\tau)\,d\tau = \int_0^t d\tau = t.$$

(There is again no contribution to the integral from the radii.) 


l. Center of Mass and Moment of a Curve 


We now turn to some ideas arising in mechanics. We consider a 
system of n particles in a plane having the masses $m_1, m_2, \ldots, m_n$ and 
the respective ordinates $y_1, y_2, \ldots, y_n$. We then call 

$$T = \sum_{\nu=1}^{n} m_\nu y_\nu = m_1 y_1 + m_2 y_2 + \cdots + m_n y_n$$

the moment of the system of particles with respect to the x-axis. The 
expression $\eta = T/M$, where M denotes the total mass $m_1 + m_2 + 
\cdots + m_n$ of the system, defines the height of the center of mass of the 
system of particles above the x-axis, or its ordinate. It is just the 
weighted average of $y_1, y_2, \ldots, y_n$ using the “weight factors” $m_1$, 
$m_2, \ldots, m_n$ (see p. 142). Hence $\eta$ is the average height of the masses. 
We define similarly the moment with respect to the y-axis and the 
abscissa of the center of mass. 

We can now easily extend these definitions of the moment to a curve 
along which a mass is uniformly distributed, and thus define the 
coordinates $\xi$ and $\eta$ of the center of mass of such a curve. (The 
assumption of a constant density, say $\mu$, along the curve is not essential: 
Any continuous distribution could be discussed equally well.) 

In a procedure typical for mechanics we start with a system of a 
finite number n of particles, and then pass to a limit for $n \to \infty$. For this 
purpose we introduce the length of arc s as a parameter on the curve, 
and subdivide the curve by (n − 1) points of division into arcs of 
lengths $\Delta s_1, \Delta s_2, \ldots, \Delta s_n$. We represent the mass $\mu\,\Delta s_i$ of each arc 
$\Delta s_i$ as if it were concentrated at an arbitrary point of the arc, say that with 
the ordinate $y_i$. 

By definition the moment of this system of particles with respect to 
the x-axis is 

$$T = \mu\sum_{i=1}^{n} y_i\,\Delta s_i.$$

If now the largest of the quantities $\Delta s_i$ tends to zero, this sum tends 
to a limit given by the integral 

$$T = \mu\int_{s_0}^{s_1} y\,ds = \mu\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx, \tag{23}$$

which is therefore naturally accepted as the definition of the moment 
of the curve with respect to the x-axis. Since the total mass of the curve 
is equal to its length multiplied by $\mu$, 

$$\mu\int_{s_0}^{s_1} ds = \mu(s_1 - s_0),$$

we are immediately led to the following expressions for the coordinates 
of the center of mass of the curve: 

$$\xi = \frac{\int_{s_0}^{s_1} x\,ds}{s_1 - s_0}, \qquad \eta = \frac{\int_{s_0}^{s_1} y\,ds}{s_1 - s_0}. \tag{24}$$

These statements are actually definitions of the moment and center- 
of-mass of a curve; but they are such straightforward extensions of the 
simpler case of a finite number of particles that we naturally expect 
that—as is actually the case—any statement in mechanics involving the 
center-of-mass or the moment of a system of particles will be valid 
also for continuous mass distribution along curves. 
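
The passage from a finite system of particles to the integrals (24) can be imitated numerically. In the following Python sketch, which is only an illustration, the curve is replaced by an inscribed polygon, and the mass of each side is taken to be concentrated at the midpoint of that side; this particular choice, the function name `center_of_mass`, and the sample curve (a semicircular arc of radius R = 2) are assumptions made here. For the semicircle the exact center of mass lies at height $2R/\pi$ on the axis of symmetry.

```python
import math

def center_of_mass(x, y, t0, t1, n=20000):
    """Approximate (xi, eta) of formula (24) and the length of the curve
    x = x(t), y = y(t), t0 <= t <= t1, of constant density."""
    h = (t1 - t0) / n
    length = moment_y_axis = moment_x_axis = 0.0
    px, py = x(t0), y(t0)
    for i in range(1, n + 1):
        t = t0 + i * h
        qx, qy = x(t), y(t)
        ds = math.hypot(qx - px, qy - py)        # length of one side
        moment_y_axis += 0.5 * (px + qx) * ds    # contributes to integral of x ds
        moment_x_axis += 0.5 * (py + qy) * ds    # contributes to integral of y ds
        length += ds
        px, py = qx, qy
    return moment_y_axis / length, moment_x_axis / length, length

R = 2.0   # hypothetical radius
xi, eta, length = center_of_mass(lambda t: R * math.cos(t),
                                 lambda t: R * math.sin(t),
                                 0.0, math.pi)
print("xi, eta:", xi, eta)
print("2R/pi  :", 2.0 * R / math.pi)
print("length :", length, " pi R =", math.pi * R)
```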


m. Area and Volume of a Surface of Revolution. 
Guldin’s Rule 


If we rotate a curve y = f(x), for which $f(x) \ge 0$, about the x-axis, 
it describes a so-called surface of revolution. The area of this surface, 
whose abscissas we suppose to lie between the bounds $x_0$ and $x_1 > x_0$, 
is obtained by a discussion analogous to that above. For if we 
replace the curve by an inscribed polygon, instead of the curved 
surface, we have a figure composed of a number of thin truncated 
cones. Intuition suggests that we should define the area of the surface 
of revolution as the limit of the areas of these conical surfaces when 
the length of the longest side of the inscribed polygon tends to zero. 
From elementary geometry we know that the area of each truncated 
cone is equal to the length of the slanted straight generating side 
multiplied by the circumference of the circular section of mean radius 
(Fig. 4.28). If we add these expressions and then carry out the passage 
to the limit, we obtain the expression 

$$A = 2\pi\int_{s_0}^{s_1} y\,ds = 2\pi\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx = 2\pi\eta(s_1 - s_0) \tag{25}$$

for the area. Expressed in words, this result states that the area of a 
surface of revolution is equal to the length of the generating curve 
multiplied by the distance traversed by the center of mass (Guldin’s 
rule). 

In the same way we find that the volume interior to the surface of 
revolution and bounded at the ends by the planes $x = x_0$ and 
$x = x_1 > x_0$ is given by the expression 

$$V = \pi\int_{x_0}^{x_1} y^2\,dx. \tag{26}$$


Figure 4.28 Area of surface of revolution. 


This formula is obtained by following the intuitive suggestion that the 
volume in question is the limit of the volumes of the earlier mentioned 
figures consisting of truncated cones. The rest of the proof is left to 
the reader. 
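
The construction by truncated cones can be carried out numerically. The Python sketch below is an illustration under assumptions of our own (the function name `revolve`, the equidistant subdivision, and the sample curve, a semicircle of radius R = 1 generating a sphere). It adds the lateral areas and the volumes of the truncated cones belonging to an inscribed polygon and compares the results with the known values $4\pi R^2$ and $\frac{4}{3}\pi R^3$, and with Guldin's rule.

```python
import math

def revolve(f, x0, x1, n=20000):
    """Area and volume of the figure made of truncated cones obtained by
    rotating an inscribed polygon of y = f(x) about the x-axis; also the
    length of the polygon and the height eta of its center of mass."""
    h = (x1 - x0) / n
    area = volume = length = moment = 0.0
    xa, ya = x0, f(x0)
    for i in range(1, n + 1):
        xb = x0 + i * h
        yb = f(xb)
        side = math.hypot(xb - xa, yb - ya)       # slanted generating side
        mean_radius = 0.5 * (ya + yb)
        area += side * 2.0 * math.pi * mean_radius
        volume += math.pi * (xb - xa) * (ya * ya + ya * yb + yb * yb) / 3.0
        length += side
        moment += mean_radius * side
        xa, ya = xb, yb
    return area, volume, length, moment / length

R = 1.0   # hypothetical radius

def semicircle(x):
    # max(...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, R * R - x * x))

area, volume, length, eta = revolve(semicircle, -R, R)
print("area  :", area, " 4 pi R^2     =", 4.0 * math.pi * R * R)
print("volume:", volume, " (4/3) pi R^3 =", 4.0 * math.pi * R**3 / 3.0)
print("Guldin: 2 pi eta (s1 - s0) =", 2.0 * math.pi * eta * length)
```

For the polygon itself the agreement between the area and $2\pi\eta(s_1 - s_0)$ is exact up to rounding, since both are formed from the same sum of products of side length and mean radius.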


n. Moment of Inertia 


In the study of the rotation of an object an important role is played 
by certain quantities called moments of inertia. These expressions will 
be briefly mentioned here. 

We suppose that a particle of mass m at a distance y from the x-axis rotates 
uniformly about that axis with angular velocity $\omega$ (that is, in unit time 
it rotates through an angle $\omega$). The kinetic energy of the particle, 
expressed by half the product of the mass and the square of the velocity, 
is obviously 

$$\frac{m}{2}(y\omega)^2.$$

We call the coefficient of $\frac{1}{2}\omega^2$, that is, the quantity $my^2$, the moment 
of inertia of the particle about the x-axis. 


Similarly, if we have n particles with masses $m_1, m_2, \ldots, m_n$ and 
ordinates $y_1, y_2, \ldots, y_n$, we call the expression 

$$T = \sum_{\nu=1}^{n} m_\nu y_\nu^2$$


the moment of inertia of the system of masses about the x-axis. The 
moment of inertia is a quantity that belongs to the system of masses 
itself, without reference to its state of motion. Its importance lies in the 
fact that under rigid rotation of the system about an axis, without 
change of the distance between pairs of particles, the kinetic energy is 
obtained by multiplying the moment of inertia about that axis by half 
the square of the angular velocity. Thus the moment of inertia about 
an axis plays the same part in rotation about an axis as is played by the 
mass in rectilinear motion. 

Suppose now that we have an arbitrary curve y = f(x) lying between 
the abscissas $x_0$ and $x_1 > x_0$, along which a mass is uniformly 
distributed with unit density. In order to define the moment of inertia of 
this curve we proceed just as in the preceding section, arriving at 
an expression for the moment of inertia about the x-axis, 

$$T_x = \int_{s_0}^{s_1} y^2\,ds = \int_{x_0}^{x_1} y^2\sqrt{1 + y'^2}\,dx. \tag{27}$$

For the moment of inertia about the y-axis we have correspondingly 

$$T_y = \int_{s_0}^{s_1} x^2\,ds = \int_{x_0}^{x_1} x^2\sqrt{1 + y'^2}\,dx. \tag{28}$$
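
Formulas (27) and (28) lend themselves to direct numerical evaluation. The following Python sketch is an illustration under stated assumptions: the function names `simpson` and `moments_of_inertia`, the use of Simpson's rule, the sample curve $y = \cosh x$ on $0 \le x \le 1$, and the value of $\omega$ are all choices made here. For this sample curve $\sqrt{1 + y'^2} = \cosh x$, so that the integrals can also be found in closed form: $T_x = \sinh 1 + \frac{1}{3}\sinh^3 1$ and $T_y = 3\sinh 1 - 2\cosh 1$.

```python
import math

def simpson(g, x0, x1, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (x1 - x0) / n
    total = g(x0) + g(x1)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(x0 + i * h)
    return total * h / 3.0

def moments_of_inertia(f, df, x0, x1):
    """Formulas (27) and (28) for the curve y = f(x) with derivative df,
    unit density assumed."""
    def ds(x):
        return math.sqrt(1.0 + df(x) ** 2)
    t_x = simpson(lambda x: f(x) ** 2 * ds(x), x0, x1)
    t_y = simpson(lambda x: x ** 2 * ds(x), x0, x1)
    return t_x, t_y

# Hypothetical sample curve y = cosh x, 0 <= x <= 1.
t_x, t_y = moments_of_inertia(math.cosh, math.sinh, 0.0, 1.0)
print("T_x:", t_x, " closed form:", math.sinh(1.0) + math.sinh(1.0) ** 3 / 3.0)
print("T_y:", t_y, " closed form:", 3.0 * math.sinh(1.0) - 2.0 * math.cosh(1.0))

# Kinetic energy of rigid rotation about the x-axis with angular velocity omega.
omega = 2.0
print("kinetic energy:", 0.5 * t_x * omega ** 2)
```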


4.2 Examples 

From the great variety of plane curves we choose a few typical 
examples to illustrate the concepts discussed. 

a. The Common Cycloid 


From the equations (cf. (1), p. 329) $x = a(t - \sin t)$, $y = a(1 - \cos t)$, 
we obtain $\dot x = a(1 - \cos t)$, $\dot y = a\sin t$, and find for the length of arc 

$$s = \int_0^\alpha \sqrt{\dot x^2 + \dot y^2}\,dt = \int_0^\alpha \sqrt{2a^2(1 - \cos t)}\,dt.$$

Since $1 - \cos t = 2\sin^2(t/2)$ the integrand is equal to $2a\sin(t/2)$, and 
hence for $0 \le \alpha \le 2\pi$ 

$$s = 2a\int_0^\alpha \sin\frac{t}{2}\,dt = -4a\cos\frac{t}{2}\Big|_0^\alpha = 4a\left(1 - \cos\frac{\alpha}{2}\right) = 8a\sin^2\frac{\alpha}{4}.$$

If, in particular, we consider the length of arc between two successive 
cusps, we must put $\alpha = 2\pi$, since the interval $0 \le t \le 2\pi$ of the values 
of the parameter corresponds to one revolution of the rolling circle. 
We thus obtain the value 8a; that is, the length of arc of the cycloid 
between successive cusps is equal to four times the diameter of the 
rolling circle. 

Similarly, we calculate the area bounded by one arch of the cycloid 
and the x-axis: 

$$I = \int_0^{2\pi} y\dot x\,dt = a^2\int_0^{2\pi} (1 - \cos t)^2\,dt = a^2\int_0^{2\pi} (1 - 2\cos t + \cos^2 t)\,dt$$

$$= a^2\left(t - 2\sin t + \frac{t}{2} + \frac{\sin 2t}{4}\right)\Big|_0^{2\pi} = 3a^2\pi.$$

This area is therefore three times the area of the rolling circle. 
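
Both results for the cycloid can be confirmed by a rough numerical computation. In the Python sketch below, given only as an illustration, the arch is replaced by an inscribed polygon; the function names `cycloid` and `arc_length_and_area`, the trapezoidal sums, and the sample value a = 1.5 are assumptions made for the example.

```python
import math

def cycloid(t, a):
    """Point of the common cycloid x = a(t - sin t), y = a(1 - cos t)."""
    return a * (t - math.sin(t)), a * (1.0 - math.cos(t))

def arc_length_and_area(a, alpha, n=20000):
    """Length of the inscribed polygon for 0 <= t <= alpha and the area
    between that polygon and the x-axis (sum of trapezoids)."""
    h = alpha / n
    length = area = 0.0
    px, py = cycloid(0.0, a)
    for i in range(1, n + 1):
        qx, qy = cycloid(i * h, a)
        length += math.hypot(qx - px, qy - py)
        area += 0.5 * (py + qy) * (qx - px)
        px, py = qx, qy
    return length, area

a = 1.5   # hypothetical radius of the rolling circle

full_length, full_area = arc_length_and_area(a, 2.0 * math.pi)
print("length of one arch:", full_length, " 8a       =", 8.0 * a)
print("area under arch   :", full_area, " 3 pi a^2 =", 3.0 * math.pi * a * a)

# Partial arc: s = 8 a sin^2(alpha / 4), here for alpha = pi.
half_length, _ = arc_length_and_area(a, math.pi)
print("length for alpha = pi:", half_length,
      " formula:", 8.0 * a * math.sin(math.pi / 4.0) ** 2)
```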

For the radius of curvature $|\rho| = 1/|\kappa|$ we have by Eq. (15), p. 355, 

$$|\rho| = \frac{(\dot x^2 + \dot y^2)^{3/2}}{|\dot x\ddot y - \dot y\ddot x|} = 4a\left|\sin\frac{t}{2}\right|;$$

at the points $t = 0$, $t = \pm 2\pi, \ldots$ this expression has the value zero. 
These are actually the cusps, where the cycloid meets the x-axis at right 
angles. 

The area of the surface of revolution formed by rotating an arch of 
the cycloid about the x-axis is given according to our formula (25), 
p. 374, by 

$$A = 2\pi\int_0^{8a} y\,ds = 2\pi\int_0^{2\pi} a(1 - \cos t)\cdot 2a\sin\frac{t}{2}\,dt$$

$$= 8a^2\pi\int_0^{2\pi} \sin^3\frac{t}{2}\,dt = 16a^2\pi\int_0^{\pi} \sin^3 u\,du = 16a^2\pi\int_0^{\pi} (1 - \cos^2 u)\sin u\,du.$$

The last integral can be evaluated by means of the substitution 
$\cos u = v$; we find 

$$A = 16a^2\pi\left(-\cos u + \frac{1}{3}\cos^3 u\right)\Big|_0^{\pi} = \frac{64a^2\pi}{3}.$$


As an exercise the reader may calculate for himself the height $\eta$ 
of the center-of-mass of the cycloid above the x-axis and also the 
moment of inertia $T_x$. The results are 

$$\eta = \frac{A}{2\pi s} = \frac{4}{3}a \qquad \text{and} \qquad T_x = \frac{256}{15}a^3.$$


b. The Catenary 


The catenary¹ is the curve defined by the equation $y = \cosh x$. The 
length of the catenary between the abscissas x = a and x = b is 

$$s = \int_a^b \sqrt{1 + \sinh^2 x}\,dx = \int_a^b \cosh x\,dx = \sinh b - \sinh a.$$

For the area of the surface of revolution obtained by rotating the 
catenary about the x-axis, the so-called catenoid, we find 

$$A = 2\pi\int_a^b \cosh^2 x\,dx = 2\pi\int_a^b \frac{1 + \cosh 2x}{2}\,dx = \pi\left(b - a + \tfrac{1}{2}\sinh 2b - \tfrac{1}{2}\sinh 2a\right).$$

From this we further obtain the height of the center-of-mass of the arc 
from a to b: 

$$\eta = \frac{A}{2\pi s} = \frac{b - a + \frac{1}{2}\sinh 2b - \frac{1}{2}\sinh 2a}{2(\sinh b - \sinh a)}.$$

Finally, for the curvature we have 

$$\kappa = \frac{y''}{(1 + y'^2)^{3/2}} = \frac{\cosh x}{\cosh^3 x} = \frac{1}{\cosh^2 x}.$$
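
The formulas for the catenary can be compared with numerical values. The Python sketch below is an illustration only; the function names `catenary_quantities` and `curvature`, the midpoint rule, the difference quotients, and the sample interval from a = −1 to b = 2 are assumptions made here. The curvature is estimated from difference quotients of $\cosh x$ and compared with $1/\cosh^2 x$.

```python
import math

def catenary_quantities(a, b, n=20000):
    """Length of y = cosh x between x = a and x = b, area of the catenoid,
    and height of the center of mass, by the midpoint rule."""
    h = (b - a) / n
    length = area = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        ds = math.sqrt(1.0 + math.sinh(x) ** 2) * h
        length += ds
        area += 2.0 * math.pi * math.cosh(x) * ds
    return length, area, area / (2.0 * math.pi * length)

def curvature(x, h=1e-4):
    """Curvature y''/(1 + y'^2)^(3/2) of y = cosh x from difference quotients."""
    d1 = (math.cosh(x + h) - math.cosh(x - h)) / (2.0 * h)
    d2 = (math.cosh(x + h) - 2.0 * math.cosh(x) + math.cosh(x - h)) / h ** 2
    return d2 / (1.0 + d1 * d1) ** 1.5

a, b = -1.0, 2.0   # hypothetical abscissas
length, area, eta = catenary_quantities(a, b)
exact_length = math.sinh(b) - math.sinh(a)
exact_area = math.pi * (b - a + 0.5 * math.sinh(2 * b) - 0.5 * math.sinh(2 * a))

print("length:", length, " sinh b - sinh a:", exact_length)
print("area  :", area, " closed form    :", exact_area)
print("eta   :", eta, " A / (2 pi s)   :", exact_area / (2 * math.pi * exact_length))
print("kappa at x = 0.7:", curvature(0.7), " 1/cosh^2:", 1.0 / math.cosh(0.7) ** 2)
```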


c. The Ellipse and the Lemniscate 


The lengths of arc of these two curves cannot be reduced to elemen- 
tary functions, but belong to the class of “elliptic integrals” mentioned 
on p. 299. 


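For the ellipse $y = (b/a)\sqrt{a^2 - x^2}$ we obtain 

$$s = \frac{1}{a}\int \sqrt{\frac{a^4 - (a^2 - b^2)x^2}{a^2 - x^2}}\,dx = a\int \sqrt{\frac{1 - \kappa^2\xi^2}{1 - \xi^2}}\,d\xi,$$

where we have put $x/a = \xi$, $1 - b^2/a^2 = \kappa^2$. By the substitution 
$\xi = \sin\varphi$ this integral can be expressed in the form 

$$s = a\int \sqrt{1 - \kappa^2\sin^2\varphi}\,d\varphi.$$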


1 The name derives from the fact that a chain suspended from its ends will assume 
the shape of this curve. Curiously enough the same curve arises in a quite different 
physical application. A soap film, bounded by two circles in space that lie in parallel 
planes and have centers on the same perpendicular to those planes, has the same 
shape as the surface of revolution obtained by rotating the catenary about the x-axis. 
Here, to obtain the semiperimeter of the ellipse, we must let x traverse 
the interval from $-a$ to $+a$, which corresponds to the interval 

\[
-1 \le \xi \le +1 \qquad\text{or}\qquad -\frac{\pi}{2} \le \varphi \le +\frac{\pi}{2}.
\]
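Although the arc length of the ellipse is not elementary, it is easy to evaluate numerically. The sketch below assumes semiaxes $a \ge b > 0$ and writes $k^2 = 1 - b^2/a^2$ for the constant in the integrand; the function names are illustrative.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def ellipse_semiperimeter(a, b):
    """a * integral of sqrt(1 - k^2 sin^2(phi)) over -pi/2 <= phi <= pi/2."""
    k2 = 1 - (b / a) ** 2
    integrand = lambda phi: math.sqrt(1 - k2 * math.sin(phi) ** 2)
    return a * simpson(integrand, -math.pi / 2, math.pi / 2)

print(ellipse_semiperimeter(3.0, 3.0), math.pi * 3.0)   # circle: both equal pi * a
print(2 * ellipse_semiperimeter(2.0, 1.0))               # full perimeter for a = 2, b = 1
```

For $b = a$ the integrand is identically 1 and the semiperimeter reduces to $\pi a$, which gives a simple consistency check.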


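For the lemniscate, whose equation in polar coordinates r, t is 
$r^2 = 2a^2 \cos 2t$, we similarly obtain 

\[
s = \int \sqrt{r^2 + \dot{r}^2}\, dt = a\sqrt{2} \int \frac{dt}{\sqrt{\cos 2t}}
  = a\sqrt{2} \int \frac{dt}{\sqrt{1 - 2\sin^2 t}}.
\]

If we introduce $u = \tan t$ as an independent variable in the last integral, 
we have 

\[
\sin^2 t = \frac{u^2}{1 + u^2}, \qquad dt = \frac{du}{1 + u^2},
\]

and consequently, 

\[
s = a\sqrt{2} \int \frac{du}{\sqrt{1 - u^4}}.
\]

In a complete loop of the lemniscate u runs from $-1$ to $+1$, and the 
length of arc is therefore equal to 

\[
a\sqrt{2} \int_{-1}^{+1} \frac{du}{\sqrt{1 - u^4}},
\]

a special elliptic integral which played a great part in the researches 
of Gauss. 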
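The integrand $1/\sqrt{1 - u^4}$ is unbounded at $u = \pm 1$, so a direct quadrature converges slowly. The sketch below assumes the further substitution $u = \sin\theta$ (not used in the text), which turns the integral into $\int_{-\pi/2}^{\pi/2} d\theta / \sqrt{1 + \sin^2\theta}$ with a smooth integrand. The function names are illustrative.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def lemniscate_loop_length(a):
    """Length of one loop of r^2 = 2 a^2 cos 2t, i.e. a*sqrt(2) * int_{-1}^{1} du / sqrt(1 - u^4)."""
    # With u = sin(theta): du / sqrt(1 - u^4) = d(theta) / sqrt(1 + sin^2(theta)).
    integral = simpson(lambda th: 1 / math.sqrt(1 + math.sin(th) ** 2),
                       -math.pi / 2, math.pi / 2)
    return a * math.sqrt(2) * integral

print(lemniscate_loop_length(1.0) / math.sqrt(2))   # the integral itself, about 2.622
```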


4.3 Vectors in Two Dimensions 


For the discussions of plane curves and of many other topics in 
geometry, mechanics, and physics, vector notation constitutes a 
convenient and almost indispensable tool. We shall develop and 
apply in this chapter the concept of a vector in two dimensions, leaving 
extensions to higher dimensions to Volume II. 


Intuitive Explanation 


Many mathematical and physical objects are characterized completely 
by a single number, called a “scalar” since it measures the 
object on a given scale. Examples are angles, lengths, areas, times, 
masses, and temperatures. There are other objects, however, for which 
such a characterization is not possible, for example, the shape of a 
triangle, the location of a point in space, the acceleration or direction 
of motion of a particle, and the state of tension in a body. Several 
numbers are required to identify each of these objects. Gradually, 
mathematical concepts beyond the continuum of real numbers have 
been developed which permit us to represent such objects by a single 
symbol.¹ Vectors in a plane are objects that can be described by 
two items of information: a length and a direction. Of this type are, 
for example, the relative position of two points, the velocity and 
acceleration of a particle, and the force acting on a particle.² 

Geometrically, or intuitively, a vector is essentially a directed straight 
line segment in the plane (or in space), characterized by its length or 
magnitude and by its direction. Ordinarily, vectors are indicated by 
arrows of the given length and pointing in the given direction. Unless a 
restriction is explicitly imposed, the vector is “free”, that is, the 
location of the beginning of the directed line, or arrow, is not an 
inherent part of the specifications for the vector. 

While physical concepts, such as velocity, acceleration, and force, 
are primary instances of vectors in applications, we shall define vectors 
geometrically, by means of “translations” or “parallel displacements.” 

Vector analysis starts simply by giving a name, “vector,” to such 
directed line segments or parallel displacements. However, its decisive 
significance is not that a unifying name was introduced, but that these 
entities, the vectors, can (like the complex numbers) be combined 
with each other or with scalars (that is, ordinary numbers) by a set of 
rules, called vector algebra or vector analysis, in ways that have natural 
interpretations in the various applications, as, for example, the 
superposition of two velocities or the work done by a displacement against a 
force. In the intuitively appealing language of vectors many mathematical 
and physical relations can be expressed concisely and clearly. 


a. Definition of Vectors by Translation. Notations 


The simplest type of transformation of the plane is a translation or 
parallel displacement. A translation shifts or maps any point $P = (x, y)$ 
into the point $P' = (x', y')$ with coordinates 

\[
x' = x + a, \qquad y' = y + b,
\]


1 Of course, complex numbers $a + bi = z$ are such symbols representing the pair 
of real numbers a, b; it is indeed sometimes convenient to use complex numbers 
rather than vectors. 

2 Vectors are insufficient for some purposes; to describe, for example, tensions or 
curvature of spaces, more general entities called “tensors” are used. 
where a and b are constants. The translation is completely determined 
by the constants a and b, which we call the components of the translation. 
We shall use the term “vector” as another name for “translation.” 
Employing boldface type to denote vectors or translations, 
we write $\mathbf{R} = (a, b)$ for the vector with components a, b (Fig. 4.29). 


Figure 4.29 The translation $x' = x + 2$, $y' = y + 1$ corresponding to the vector 
$\mathbf{R} = \overrightarrow{PP'} = \overrightarrow{OO'} = (2, 1)$. 


The components of the vector $\mathbf{R}$ are determined by one pair of 
corresponding points $P = (x, y)$ and $P' = (x', y')$, since then 

\[
a = x' - x, \qquad b = y' - y.
\]


Clearly, for any points P and P' it is always possible to find a translation 
$\mathbf{R}$ which takes P into P'. We denote it as the vector 
$\mathbf{R} = \overrightarrow{PP'}$. Any ordered pair of points $P = (x, y)$, 
$P' = (x', y')$, that is, any oriented line segment, thus determines the vector 
$\mathbf{R} = \overrightarrow{PP'} = (x' - x, y' - y)$. 
We observe that a second pair of points $Q = (\xi, \eta)$, $Q' = (\xi', \eta')$ 
defines the same vector if $\xi' - \xi = x' - x$ and $\eta' - \eta = y' - y$; 
the same translation $\mathbf{R}$ then takes P into P' and Q into Q'. Vectors 
$\mathbf{R}$ are determined by two numbers, the components, just as points are by 
two coordinates in the plane; the basic distinction is that a vector is 
represented geometrically by a pair of points. In the representation 
$\mathbf{R} = \overrightarrow{PP'}$ we call P the initial point and P' the end point. 
For given $\mathbf{R}$ one of the points, say the initial point $P = (x, y)$, can be 
chosen arbitrarily; the end point $P' = (x', y')$ is then determined uniquely 
by the relations $x' = x + a$, $y' = y + b$. Interchanging initial and 
end point leads to the opposite vector $\overrightarrow{P'P} = (-a, -b)$. 

Figure 4.30 Components a, b and length r of a vector $\mathbf{R} = \overrightarrow{PP'}$. 

If we choose for the initial point the origin $O = (0, 0)$, we can 
associate uniquely a vector $\mathbf{R}$ with every point $Q = (x, y)$ by taking 
$\mathbf{R} = \overrightarrow{OQ}$. The vector $\mathbf{R}$ with the fixed initial point O is then called the 
position vector of Q. The components of the position vector of Q are simply 
the coordinates x, y of Q. 

The vector $\mathbf{R}$ with components $a = 0$, $b = 0$ is called the null 
vector and is denoted by $\mathbf{O}$. It corresponds to a translation that leaves 
every point fixed: 

\[
\mathbf{O} = (0, 0) = \overrightarrow{PP}.
\]

The distance r of two points $P = (x, y)$, $P' = (x', y')$ depends only 
on the vector $\mathbf{R} = (a, b) = \overrightarrow{PP'}$, since 

\[
r = \sqrt{(x' - x)^2 + (y' - y)^2} = \sqrt{a^2 + b^2}.
\]

We call it the length of the vector $\mathbf{R}$ and write $r = |\mathbf{R}|$. The length of $\mathbf{R}$ 
is always a positive number unless $\mathbf{R} = \mathbf{O}$ (see Fig. 4.30). 

We define the product of a vector $\mathbf{R} = (a, b)$ by a number or 
“scalar” $\lambda$ as the vector 

\[
\mathbf{R}^* = \lambda\mathbf{R} = (\lambda a, \lambda b).
\]

For $\lambda = -1$ we have in $\mathbf{R}^* = (-a, -b)$ the vector opposite to $\mathbf{R}$ 
(Fig. 4.31). 


Figure 4.31 Scalar multiples of a vector R. 
If $\mathbf{R} = \overrightarrow{PP'} = (a, b)$ with $P = (x, y)$, $P' = (x', y')$, we can represent 
$\mathbf{R}^* = \lambda\mathbf{R}$ as $\overrightarrow{PP''}$, where $P'' = (x'', y'') = (x + \lambda a, y + \lambda b)$ 
(see Fig. 4.32). For $a = b = 0$ we have, of course, $P'' = P' = P$. For a and b 
not both zero the point $P'' = (x'', y'') = (x + \lambda a, y + \lambda b)$ traverses 
for varying $\lambda$ the whole line 

\[
x''b - y''a = xb - ya.
\]

The value $\lambda = 0$ gives $P'' = P$, whereas $\lambda = 1$ gives $P'' = P'$. Thus $P''$ 
lies on the line through P and P'; for $\lambda > 0$ the points $P''$ and $P'$ lie on 
the same side of P, for $\lambda < 0$ they lie on opposite sides. 
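The definitions so far translate directly into a few lines of code. The following sketch represents points and vectors as plain coordinate pairs; the function names `vector`, `length`, `scale`, and `translate` are hypothetical and chosen only for illustration.

```python
import math

def vector(p, p_prime):
    """Components (a, b) of the translation taking point p into point p_prime."""
    return (p_prime[0] - p[0], p_prime[1] - p[1])

def length(r):
    """Length r = sqrt(a^2 + b^2) of the vector r = (a, b)."""
    return math.hypot(r[0], r[1])

def scale(lam, r):
    """Scalar multiple lam * R = (lam * a, lam * b)."""
    return (lam * r[0], lam * r[1])

def translate(p, r):
    """End point of the vector r when its initial point is p."""
    return (p[0] + r[0], p[1] + r[1])

P, P1 = (1.0, 2.0), (4.0, 6.0)
R = vector(P, P1)
for lam in (-1.0, 0.0, 0.5, 1.0, 2.0):
    x2, y2 = translate(P, scale(lam, R))
    # Every point P'' lies on the line x''b - y''a = xb - ya.
    assert math.isclose(x2 * R[1] - y2 * R[0], P[0] * R[1] - P[1] * R[0])
print(R, length(R))
```

Tuples are used instead of a class to keep the correspondence with the component notation $(a, b)$ as direct as possible.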


Figure 4.32 The vector relation $\mathbf{R}^* = \overrightarrow{PP''} = \lambda\,\overrightarrow{PP'}$. 


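The two vectors $\mathbf{R} = (a, b)$ and $\mathbf{R}^* = (a^*, b^*)$ are said to have the 
same direction if $\mathbf{R}^* = \lambda\mathbf{R}$ with a positive $\lambda$ and opposite directions if 
$\lambda < 0$. If $\mathbf{R} = \mathbf{O}$, this means that also $\mathbf{R}^* = \mathbf{O}$. If $\mathbf{R} \ne \mathbf{O}$, the 
necessary and sufficient condition for $\mathbf{R}^*$ to have the same direction as 
$\mathbf{R}$ is that 

\[
\frac{a}{\sqrt{a^2 + b^2}} = \frac{a^*}{\sqrt{a^{*2} + b^{*2}}}, \qquad
\frac{b}{\sqrt{a^2 + b^2}} = \frac{b^*}{\sqrt{a^{*2} + b^{*2}}}.
\]

We call the quantities 

\[
\xi = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \eta = \frac{b}{\sqrt{a^2 + b^2}},
\]

which determine the direction of the vector $\mathbf{R}$, the direction cosines of $\mathbf{R}$; 
they are, of course, not defined for $\mathbf{R} = \mathbf{O}$. Since $\xi^2 + \eta^2 = 1$, we 
can always find an angle $\alpha$ and a corresponding angle $\beta = \pi/2 - \alpha$ 
such that 

\[
\xi = \cos\alpha, \qquad \eta = \sin\alpha = \cos\beta.
\]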
The angle $\alpha$ is called a direction angle of $\mathbf{R}$ (Fig. 4.33). It is determined 
uniquely only to within an even multiple of $\pi$. For $\mathbf{R} = \overrightarrow{PP'}$ we have 

\[
\cos\alpha = \frac{x' - x}{r}, \qquad \sin\alpha = \frac{y' - y}{r}.
\]

Obviously, $\alpha$ is the angle between the positive x-axis and the line 
from P to P'. More precisely, a rotation of the positive x-axis about 
the origin by the angle $\alpha$ (counted positive if we turn counterclockwise, 
negative if clockwise) will give the axis the direction from P to P'. 


Figure 4.33 Direction cosines $\xi$, $\eta$, and direction angles for a vector $\overrightarrow{PP'}$. 


The opposite vector $-\mathbf{R} = (-a, -b)$ has direction cosines $-\xi$ and 
$-\eta$ and direction angles differing from $\alpha$ by an odd multiple of $\pi$. 


If the initial point P of the vector $\mathbf{R} = \overrightarrow{PP'}$ is the origin, the direction 
angle $\alpha$ of $\mathbf{R}$ is simply the polar angle $\theta$ of $P'$. 
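Direction cosines and direction angles can be computed as in the sketch below. It assumes vectors given as coordinate pairs; `math.atan2` returns one particular direction angle in $(-\pi, \pi]$, and all others differ from it by multiples of $2\pi$. The function names are illustrative.

```python
import math

def direction_cosines(r):
    """Return (xi, eta) = (a / |R|, b / |R|); undefined for the null vector."""
    a, b = r
    rr = math.hypot(a, b)
    if rr == 0.0:
        raise ValueError("direction cosines are undefined for the null vector")
    return a / rr, b / rr

def direction_angle(r):
    """One direction angle alpha with cos(alpha) = xi, sin(alpha) = eta."""
    xi, eta = direction_cosines(r)
    return math.atan2(eta, xi)

def same_direction(r, r_star, tol=1e-12):
    """True if both vectors have the same direction cosines (within tol)."""
    xi, eta = direction_cosines(r)
    xi_s, eta_s = direction_cosines(r_star)
    return abs(xi - xi_s) < tol and abs(eta - eta_s) < tol

R = (3.0, 4.0)
print(direction_cosines(R), direction_angle(R))
print(same_direction(R, (6.0, 8.0)), same_direction(R, (-3.0, -4.0)))
```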


b. Addition and Multiplication of Vectors 
Sums of Vectors 


Vectors have been defined by translations, that is, as certain mappings 
of points in the plane. There is a perfectly general way of combining 
any two mappings by applying them successively. If the first 
mapping carries a point P into the point P' and the second one carries 
P' into P'', the combined mapping is the one that carries P into P''. 
In the case of two vectors $\mathbf{R} = (a, b)$ and $\mathbf{R}^* = (a^*, b^*)$ the vector $\mathbf{R}$ 
will map the point $P = (x, y)$ onto the point $P' = (x + a, y + b)$ 
and $\mathbf{R}^*$ will map $P'$ onto $P'' = (x + a + a^*, y + b + b^*)$. The 
resulting mapping from P onto P'' is again a translation; we call it the 
sum¹ or the resultant of the vectors $\mathbf{R} = \overrightarrow{PP'}$ and $\mathbf{R}^* = \overrightarrow{P'P''}$, and denote 
it by $\mathbf{R} + \mathbf{R}^*$ (Fig. 4.34). The components of the sum are $a + a^*$ and 
$b + b^*$. Thus our definition of the sum of two vectors is 

\[
\overrightarrow{PP'} + \overrightarrow{P'P''} = \overrightarrow{PP''},
\]

or, if we describe the vectors by their components, 

\[
(a, b) + (a^*, b^*) = (a + a^*, b + b^*).
\]


If $\mathbf{R}^*$ is taken from the same initial point as $\mathbf{R}$, say $\mathbf{R}^* = \overrightarrow{PP'''}$, the 
points P, P', P'', and P''' form the vertices of a parallelogram. The 
two sides from P represent the vectors $\mathbf{R}$ and $\mathbf{R}^*$; the sum $\mathbf{R} + \mathbf{R}^*$ is 
represented by the diagonal from P (“parallelogram construction” 
for the sum of vectors). 

Figure 4.34 Addition of the vectors $\overrightarrow{PP'} = (a, b)$ and $\overrightarrow{P'P''} = (a^*, b^*)$. 

Sums of vectors obey the commutative and associative laws of 
arithmetic, since addition of vectors just amounts to addition of 
corresponding components (Fig. 4.35). They obey moreover the 
distributive laws for multiplication of a sum of two vectors by a number $\lambda$ 
and of a vector by the sum of two numbers $\lambda$, $\mu$:² 

\[
\lambda(\mathbf{R} + \mathbf{R}^*) = \lambda\mathbf{R} + \lambda\mathbf{R}^*, \qquad (\lambda + \mu)\mathbf{R} = \lambda\mathbf{R} + \mu\mathbf{R}.
\]


1 This “sum” is really the “symbolic product” of the two mappings as defined on 
p. 52. The sum notation is here more natural because it corresponds to addition of 
the components. 

2 To distinguish vectors from numbers in an equation we always let the number 
precede the vector in writing products; the combination $\mathbf{R}\lambda$ will not be used, 
although it could be defined by $\lambda\mathbf{R} = \mathbf{R}\lambda$. 
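Figure 4.35 Commutative and associative laws of vector addition; 
$(\mathbf{R} + \mathbf{R}^*) + \mathbf{R}^{**} = \mathbf{R} + (\mathbf{R}^* + \mathbf{R}^{**})$. 

Figure 4.36 $\overrightarrow{PP'} = \overrightarrow{OP'} - \overrightarrow{OP}$. 

Figure 4.37 $\overrightarrow{PQ} = \overrightarrow{PA} + \overrightarrow{AB} + \overrightarrow{BC} + \cdots + \overrightarrow{FQ}$. 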
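These rules permit us to express a vector $\overrightarrow{PP'}$ in terms of the position 
vectors $\overrightarrow{OP}$ and $\overrightarrow{OP'}$ of the points P and P' (Fig. 4.36): 

\[
\overrightarrow{PP'} = \overrightarrow{PO} + \overrightarrow{OP'} = \overrightarrow{OP'} + \overrightarrow{PO} = \overrightarrow{OP'} - \overrightarrow{OP}.
\]

It is important to realize that generally if we go from a point P to a 
point Q by way of points A, B, C, ..., E, F, then the vector $\overrightarrow{PQ}$ is the 
sum of the vectors $\overrightarrow{PA}, \overrightarrow{AB}, \overrightarrow{BC}, \ldots, \overrightarrow{EF}, \overrightarrow{FQ}$ (Fig. 4.37). 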
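The rule that the vector from P to Q is the sum of the vectors along any intermediate path can be illustrated by a short sketch. The helpers `add`, `sub`, and `path_sum` are hypothetical names; points are identified with their position vectors.

```python
def add(r, s):
    """Sum of two vectors: (a, b) + (a*, b*) = (a + a*, b + b*)."""
    return (r[0] + s[0], r[1] + s[1])

def sub(r, s):
    """Difference r - s; for position vectors this is the vector from s to r."""
    return (r[0] - s[0], r[1] - s[1])

def path_sum(points):
    """Sum of the vectors joining consecutive points P, A, B, ..., Q."""
    total = (0, 0)
    for start, end in zip(points, points[1:]):
        total = add(total, sub(end, start))
    return total

route = [(0, 0), (2, 1), (3, 5), (-1, 4), (6, -2)]   # P, A, B, C, Q
print(path_sum(route), sub(route[-1], route[0]))      # both give the vector PQ
```

Integer coordinates are used here so that the two results agree exactly rather than only up to rounding.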
Angle between Vectors 


The angle $\theta$ formed by a vector $\mathbf{R}^* = (a^*, b^*)$ with the vector 
$\mathbf{R} = (a, b)$ is defined as the difference of their direction angles: 
$\theta = \alpha^* - \alpha$. (It is assumed here that neither $\mathbf{R}$ nor $\mathbf{R}^*$ is a zero 
vector.) The angle $\theta$ again is determined only to within integer multiples 
of $2\pi$ (Fig. 4.38). A rotation by the angle $\theta$ (with the sign of $\theta$ indicating 
the sense of rotation) will take the direction of $\mathbf{R}$ into that of $\mathbf{R}^*$. The 
quantities $\cos\theta$ and $\sin\theta$, which are determined uniquely, can be 
expressed immediately in terms of the direction cosines of $\mathbf{R}$ and $\mathbf{R}^*$: 

\[
\cos\theta = \cos(\alpha^* - \alpha) = \cos\alpha \cos\alpha^* + \sin\alpha \sin\alpha^*
           = \frac{aa^* + bb^*}{\sqrt{a^2 + b^2}\,\sqrt{a^{*2} + b^{*2}}},
\]

\[
\sin\theta = \sin(\alpha^* - \alpha) = \cos\alpha \sin\alpha^* - \sin\alpha \cos\alpha^*
           = \frac{ab^* - a^*b}{\sqrt{a^2 + b^2}\,\sqrt{a^{*2} + b^{*2}}}.
\]

Figure 4.38 Angle $\theta$ the vector $\mathbf{R}^*$ forms with $\mathbf{R}$. 

The denominator in each expression is just the product $rr^*$ of the 
lengths of the vectors. We introduce the expressions occurring in 
the numerators as “products” of the two vectors. 
Inner Product and Exterior Product of Two Vectors 


We define the “scalar” or “inner” or “dot” product of the vectors 
$\mathbf{R} = (a, b)$ and $\mathbf{R}^* = (a^*, b^*)$ by 

\[
\mathbf{R} \cdot \mathbf{R}^* = aa^* + bb^* = rr^* \cos\theta,
\]

and the “outer” or “exterior” or “cross”¹ product by 

\[
\mathbf{R} \times \mathbf{R}^* = ab^* - a^*b = rr^* \sin\theta.
\]


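As is immediately confirmed, inner and outer products obey the 
distributive and associative laws: 

\[
\begin{aligned}
\mathbf{R} \cdot (\mathbf{R}^* + \mathbf{R}^{**}) &= \mathbf{R} \cdot \mathbf{R}^* + \mathbf{R} \cdot \mathbf{R}^{**}, \\
\mathbf{R} \times (\mathbf{R}^* + \mathbf{R}^{**}) &= \mathbf{R} \times \mathbf{R}^* + \mathbf{R} \times \mathbf{R}^{**}, \\
\lambda(\mathbf{R} \cdot \mathbf{R}^*) &= (\lambda\mathbf{R}) \cdot \mathbf{R}^* = \mathbf{R} \cdot (\lambda\mathbf{R}^*), \\
\lambda(\mathbf{R} \times \mathbf{R}^*) &= (\lambda\mathbf{R}) \times \mathbf{R}^* = \mathbf{R} \times (\lambda\mathbf{R}^*).
\end{aligned}
\]

Figure 4.39 The vector product $\mathbf{R} \times \mathbf{R}^* = |\mathbf{R}|\,|\mathbf{R}^*| \sin\theta$ as twice the area of the 
triangle $PQQ^*$. 

The commutative law of multiplication also holds for inner products, 

\[
\mathbf{R} \cdot \mathbf{R}^* = \mathbf{R}^* \cdot \mathbf{R};
\]

for exterior products, however, the sign is changed if the factors are 
interchanged: 

\[
\mathbf{R} \times \mathbf{R}^* = -\mathbf{R}^* \times \mathbf{R}.
\]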


Giving $\mathbf{R}$ and $\mathbf{R}^*$ the same initial point, $\mathbf{R} = \overrightarrow{PQ}$, $\mathbf{R}^* = \overrightarrow{PQ^*}$, we 
can interpret $\mathbf{R} \cdot \mathbf{R}^*$ as the product of the projection $r^* \cos\theta$ of the 
segment $PQ^*$ onto the segment $PQ$, with the length r of that segment. 
The outer product $\mathbf{R} \times \mathbf{R}^*$ is simply twice the area of the oriented 
triangle $PQQ^*$, taken with the positive sign if the vertices $P, Q, Q^*$ are 
in counterclockwise order, with the negative sign if in clockwise order 
(Fig. 4.39). 


1 With our definition both inner and exterior products are actually “‘scalars.’’ The 
term “‘scalar product’’ is reserved for the inner product because in three dimensions 
the analogue of the exterior product is a vector. 
For any vector $\mathbf{R} = (a, b)$ 

\[
\mathbf{R} \cdot \mathbf{R} = a^2 + b^2 = |\mathbf{R}|^2
\]

is the square of the length of the vector. Thus $\mathbf{R} \cdot \mathbf{R}$ is positive unless 
$\mathbf{R} = \mathbf{O}$. On the other hand, $\mathbf{R} \times \mathbf{R}$ is always zero. The condition for 
two nonzero vectors to be orthogonal to each other is that $\mathbf{R} \cdot \mathbf{R}^* = 0$, 
while they are parallel (that is, have the same or opposite directions) if 
$\mathbf{R} \times \mathbf{R}^* = 0$. 
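The two products are easy to experiment with. In the sketch below, `dot` and `cross` follow the component definitions $aa^* + bb^*$ and $ab^* - a^*b$; the helper `angle_from_to` (a hypothetical name) recovers the signed angle $\theta$ from $rr^*\sin\theta$ and $rr^*\cos\theta$ by means of `math.atan2`, which selects the value in $(-\pi, \pi]$.

```python
import math

def dot(r, s):
    """Inner product a a* + b b*."""
    return r[0] * s[0] + r[1] * s[1]

def cross(r, s):
    """Exterior product a b* - a* b (a scalar in two dimensions)."""
    return r[0] * s[1] - s[0] * r[1]

def angle_from_to(r, s):
    """Signed angle theta that s forms with r (both vectors nonzero)."""
    return math.atan2(cross(r, s), dot(r, s))

R, S = (3.0, 0.0), (2.0, 2.0)
theta = angle_from_to(R, S)
print(dot(R, S), cross(R, S), theta)
print(0.5 * cross(R, S))   # signed area of the triangle with sides R and S
```

Orthogonality corresponds to `dot(R, S) == 0` and parallelism to `cross(R, S) == 0`; with floating-point components these tests should be made with a tolerance.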


Equation of Straight Line 


We can easily write the equation of a line through two points and 
that of a line through a given point with a given direction, in vector 
notation. If $P = (x, y)$, $P_0 = (x_0, y_0)$, and $P_1 = (x_1, y_1)$ are three points 
with $P_0 \ne P_1$, then P lies on the line through $P_0$ and $P_1$ if the vectors 
$\overrightarrow{P_0P}$ and $\overrightarrow{P_0P_1}$ are parallel, that is, 

\[
\overrightarrow{P_0P} \times \overrightarrow{P_0P_1} = 0.
\]

If $\mathbf{R} = \overrightarrow{OP}$, $\mathbf{R}_0 = \overrightarrow{OP_0}$, and $\mathbf{R}_1 = \overrightarrow{OP_1}$ are the position vectors of the 
three points, the condition takes the form 

\[
(\mathbf{R} - \mathbf{R}_0) \times (\mathbf{R}_1 - \mathbf{R}_0) = 0
\]

or 

\[
(\mathbf{R}_1 - \mathbf{R}_0) \times \mathbf{R} = \mathbf{R}_1 \times \mathbf{R}_0.
\]

Substituting the coordinates of the points for the position vectors, we 
obtain the equation of the line in the usual form (Fig. 4.40): 

\[
(x_1 - x_0)y - (y_1 - y_0)x = x_1 y_0 - y_1 x_0.
\]

Figure 4.40 Line in vector notation. 
Instead of prescribing two points of the line we can prescribe one 
point $P_0$ and require that the line is to be parallel to a vector $\mathbf{S} = (a, b)$. 
Obviously, the equation of the line is then 

\[
(\mathbf{R} - \mathbf{R}_0) \times \mathbf{S} = 0
\]

or 

\[
(x - x_0)b - (y - y_0)a = 0.
\]

For $\mathbf{S} = \overrightarrow{P_0P_1}$ we obtain the previous equation. 

The distance d of the line from the origin can also be expressed in 
vector notation. Obviously, d multiplied with the length of the vector 
$\overrightarrow{P_0P_1}$ is twice the area of the triangle $OP_0P_1$. Hence 

\[
d = \frac{\overrightarrow{OP_0} \times \overrightarrow{OP_1}}{|\overrightarrow{P_0P_1}|}
  = \frac{\mathbf{R}_0 \times \mathbf{R}_1}{|\mathbf{R}_1 - \mathbf{R}_0|}
  = \frac{x_0 y_1 - x_1 y_0}{\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}}.
\]

Here d is taken with the positive sign if the points $O, P_0, P_1$ follow each 
other in counterclockwise order. 
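A brief sketch of these line formulas follows. The functions `line_coefficients` and `signed_distance_from_origin` are hypothetical helpers; the first returns the coefficients of the line in the form $Ax + By = C$, read off from $(x_1 - x_0)y - (y_1 - y_0)x = x_1 y_0 - y_1 x_0$.

```python
import math

def cross(r, s):
    """Exterior product a b* - a* b."""
    return r[0] * s[1] - s[0] * r[1]

def line_coefficients(p0, p1):
    """Return (A, B, C) such that A*x + B*y = C on the line through p0 and p1."""
    (x0, y0), (x1, y1) = p0, p1
    return (-(y1 - y0), x1 - x0, x1 * y0 - y1 * x0)

def signed_distance_from_origin(p0, p1):
    """d = (R0 x R1) / |R1 - R0|, positive if O, p0, p1 are counterclockwise."""
    return cross(p0, p1) / math.hypot(p1[0] - p0[0], p1[1] - p0[1])

P0, P1 = (1.0, 0.0), (0.0, 1.0)
A, B, C = line_coefficients(P0, P1)
for lam in (-2.0, 0.0, 0.3, 1.0):
    x = P0[0] + lam * (P1[0] - P0[0])
    y = P0[1] + lam * (P1[1] - P0[1])
    assert math.isclose(A * x + B * y, C)   # every point of the line satisfies the equation
print((A, B, C), signed_distance_from_origin(P0, P1))
```

Interchanging the two points reverses the orientation of the triangle $OP_0P_1$ and therefore the sign of d.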


Coordinate Vectors. A vector R = (a, b) trivially can be represented 
in the form 


\[
\mathbf{R} = a\mathbf{i} + b\mathbf{j}, \tag{29}
\]

where we denote by $\mathbf{i}$ and $\mathbf{j}$ the “coordinate vectors” 

\[
\mathbf{i} = (1, 0), \qquad \mathbf{j} = (0, 1). \tag{30}
\]

In this way $\mathbf{R}$ is split into two vectors $a\mathbf{i}$ and $b\mathbf{j}$ pointing respectively 
in the direction of the x-axis and y-axis. The components a and b of $\mathbf{R}$ 
are just the (signed) lengths of these two vectors. 

In applications one is often called upon to represent a vector $\mathbf{R}$ as 
resultant of vectors with two given orthogonal (that is, mutually 
perpendicular) directions. For that purpose it is best to introduce two 
unit vectors (that is, vectors of length 1) $\mathbf{I}$ and $\mathbf{J}$ with the given directions. 
The required decomposition of $\mathbf{R}$ is achieved if we can represent $\mathbf{R}$ 
in the form 

\[
\mathbf{R} = A\mathbf{I} + B\mathbf{J} \tag{31}
\]

with suitable scalars A, B (cf. Fig. 4.40). It is easy to find the values 
of A and B if such a representation of $\mathbf{R}$ exists. For, by assumption, 
the vectors $\mathbf{I}$ and $\mathbf{J}$ are orthogonal unit vectors, so that 

\[
\mathbf{I} \cdot \mathbf{I} = \mathbf{J} \cdot \mathbf{J} = 1, \qquad \mathbf{I} \cdot \mathbf{J} = 0. \tag{32}
\]
Forming the scalar product of Eq. (31) with $\mathbf{I}$ and $\mathbf{J}$, respectively, we find 
immediately that A and B must have the values 

\[
A = \mathbf{R} \cdot \mathbf{I}, \qquad B = \mathbf{R} \cdot \mathbf{J}; \tag{33}
\]


in words, A and B are the (signed) lengths of the projections of the 
segment representing R in the given directions. 

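The possibility of writing $\mathbf{R}$ as a linear combination (31) of $\mathbf{I}$ and $\mathbf{J}$ 
follows from the representation (29) of $\mathbf{R}$ in terms of $\mathbf{i}$ and $\mathbf{j}$, if we can 
show that $\mathbf{i}$ and $\mathbf{j}$ themselves can be expressed in terms of $\mathbf{I}$ and $\mathbf{J}$. 
However, $\mathbf{I} = (\alpha, \beta)$, $\mathbf{J} = (\gamma, \delta)$ can be written as 

\[
\mathbf{I} = \alpha\mathbf{i} + \beta\mathbf{j}, \qquad \mathbf{J} = \gamma\mathbf{i} + \delta\mathbf{j}. \tag{34}
\]

Because of (32) the quantities $\alpha, \beta, \gamma, \delta$ must satisfy the so-called 
orthogonality relations 

\[
\alpha^2 + \beta^2 = \gamma^2 + \delta^2 = 1, \qquad \alpha\gamma + \beta\delta = 0. \tag{35}
\]

If we multiply the first of the equations (34) by $\delta$, the second one by 
$\beta$, and subtract, we find 

\[
(\alpha\delta - \beta\gamma)\,\mathbf{i} = \delta\mathbf{I} - \beta\mathbf{J} \tag{36}
\]

and similarly, 

\[
(\alpha\delta - \beta\gamma)\,\mathbf{j} = -\gamma\mathbf{I} + \alpha\mathbf{J}. \tag{37}
\]

Here for the mutually perpendicular unit vectors $\mathbf{I}$ and $\mathbf{J}$ 

\[
\alpha\delta - \beta\gamma = \mathbf{I} \times \mathbf{J} = \pm 1, \tag{38}
\]

Figure 4.40 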


where the upper or lower sign holds depending on the counterclockwise 
or clockwise sense of the 90° rotation that takes I into J. In either case 
formulas (36) and (37) express i and j in terms of I and J; substituting 
these expressions into (29) justifies the representation formula (31) for 
an arbitrary vector R. 

Formula (31) also can be interpreted as the representation of the 
vector R in a new coordinate system with axes pointing respectively 
in the directions of I and J. The components of a unit vector are at the 
same time the direction cosines of the direction angle of the vector. 
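Let $\mathbf{I}$ and $\mathbf{J}$ have direction angles $\varphi$ and $\psi$ respectively. Then 

\[
\alpha = \cos\varphi, \quad \beta = \sin\varphi, \quad \gamma = \cos\psi, \quad \delta = \sin\psi.
\]

Here either $\psi = \varphi + \tfrac{1}{2}\pi$ or $\psi = \varphi - \tfrac{1}{2}\pi$. In the first case (which 
corresponds to a right-handed system of coordinate vectors $\mathbf{I}$, $\mathbf{J}$), we 
have $\gamma = -\beta$, $\delta = \alpha$, $\alpha\delta - \beta\gamma = +1$ so that 

\[
\mathbf{I} = (\cos\varphi, \sin\varphi), \qquad \mathbf{J} = (-\sin\varphi, \cos\varphi). \tag{39}
\]

The formulas (33) giving the components of $\mathbf{R}$ referred to coordinate 
vectors $\mathbf{I}$, $\mathbf{J}$ then take the form 

\[
A = a\cos\varphi + b\sin\varphi, \qquad B = -a\sin\varphi + b\cos\varphi. \tag{40}
\]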


These formulas express the relations between the components of one 
and the same vector $\mathbf{R}$ in two right-handed coordinate systems obtained 
one from the other by a rotation of axes by the angle $\varphi$. If we assume 
that the coordinate systems have the same origin O and that $\mathbf{R}$ is the 
position vector $\overrightarrow{OP}$ of an arbitrary point P we have in (40) the formulas 
for changes of coordinate systems already derived on p. 361, Equation 
(18). The components a, b and A, B are then respectively the 
coordinates of P in the two systems. 
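The decomposition with respect to a rotated pair of unit vectors can be tried out as follows. This is a sketch with hypothetical helper names; it builds the right-handed pair of Eq. (39), computes $A = \mathbf{R}\cdot\mathbf{I}$, $B = \mathbf{R}\cdot\mathbf{J}$ as in Eq. (33), and rebuilds $\mathbf{R} = A\mathbf{I} + B\mathbf{J}$ as in Eq. (31).

```python
import math

def dot(r, s):
    """Inner product a a* + b b*."""
    return r[0] * s[0] + r[1] * s[1]

def rotated_frame(phi):
    """Right-handed orthogonal unit vectors I, J with direction angles phi, phi + pi/2."""
    I = (math.cos(phi), math.sin(phi))
    J = (-math.sin(phi), math.cos(phi))
    return I, J

def components_in_frame(r, phi):
    """Components A = R.I and B = R.J of r with respect to the rotated frame."""
    I, J = rotated_frame(phi)
    return dot(r, I), dot(r, J)

def recombine(A, B, phi):
    """Rebuild the vector A*I + B*J in the original coordinates."""
    I, J = rotated_frame(phi)
    return (A * I[0] + B * J[0], A * I[1] + B * J[1])

R, phi = (2.0, 1.0), 0.7
A, B = components_in_frame(R, phi)
print((A, B), recombine(A, B, phi))   # the second pair reproduces R up to rounding
```

Since the frame is orthonormal, the length is unchanged: $A^2 + B^2 = a^2 + b^2$.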


c. Variable Vectors, Their Derivatives, and Integrals 


It is natural to consider vectors $\mathbf{R} = (a, b)$ whose components a, b 
are functions of a variable t, say $a = a(t)$, $b = b(t)$. For any t we then 
have a vector 

\[
\mathbf{R} = \mathbf{R}(t) = (a(t), b(t))
\]

and we say that $\mathbf{R}(t)$ is a vector function of t. An example is furnished 
by the position vector of a point that moves with the time t. 

We say that $\mathbf{R}(t)$ has the limit $\mathbf{R}^* = (a^*, b^*)$ for $t \to t_0$ if $a(t)$ has the 
limit $a^*$ and $b(t)$ the limit $b^*$ for $t \to t_0$. In that case the length of 
$\mathbf{R}(t)$ tends toward that of $\mathbf{R}^*$, and in case $\mathbf{R}^* \ne \mathbf{O}$ the direction of 
$\mathbf{R}(t)$ tends toward that of $\mathbf{R}^*$ (this means that the direction cosines of $\mathbf{R}$ 
tend toward those of $\mathbf{R}^*$). The vector $\mathbf{R}(t)$ is said to depend continuously 
on t, if 

\[
\lim_{t \to t_0} \mathbf{R}(t) = \mathbf{R}(t_0),
\]

that is, if the components of $\mathbf{R}$ are continuous functions of t. The 
length and, if $\mathbf{R}(t_0) \ne \mathbf{O}$, also the direction of a continuous vector 
vary continuously with t. 

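To introduce the derivative of a vector we form for two values t and 
$t + h$ of the parameter the difference quotient 

\[
\frac{1}{h}\,[\mathbf{R}(t + h) - \mathbf{R}(t)] = \left( \frac{a(t + h) - a(t)}{h}, \frac{b(t + h) - b(t)}{h} \right),
\]

and define the derivative of $\mathbf{R}$ as the limit of the difference quotient 
for $h \to 0$: 

\[
\dot{\mathbf{R}} = \frac{d\mathbf{R}}{dt} = \lim_{h \to 0} \frac{1}{h}\,[\mathbf{R}(t + h) - \mathbf{R}(t)]
 = \left( \frac{da}{dt}, \frac{db}{dt} \right) = (\dot{a}, \dot{b}).
\]

The derivative of a vector is formed by differentiating the components. 
Derivatives of products of vectors are easily seen to obey the ordinary 
rules 

\[
\frac{d}{dt}(\mathbf{R} \cdot \mathbf{S}) = \frac{d\mathbf{R}}{dt} \cdot \mathbf{S} + \mathbf{R} \cdot \frac{d\mathbf{S}}{dt}
 = \dot{\mathbf{R}} \cdot \mathbf{S} + \mathbf{R} \cdot \dot{\mathbf{S}},
\]

\[
\frac{d}{dt}(\mathbf{R} \times \mathbf{S}) = \dot{\mathbf{R}} \times \mathbf{S} + \mathbf{R} \times \dot{\mathbf{S}},
\]

where for outer products, factors have to be taken in the original order. 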
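We define similarly the integral of the vector $\mathbf{R}(t)$ in terms of the 
integrals of its components: 

\[
\int_\alpha^\beta \mathbf{R}(t)\, dt = \left( \int_\alpha^\beta a(t)\, dt, \int_\alpha^\beta b(t)\, dt \right).
\]

The fundamental theorem of calculus implies 

\[
\frac{d}{dt} \int_\alpha^t \mathbf{R}(s)\, ds = \mathbf{R}(t).
\]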
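The product rules can be checked numerically for a concrete pair of vector functions. In the sketch below the functions `R` and `S` and their derivatives are arbitrary illustrative choices, not taken from the text, and the derivative of each scalar product is approximated by a central difference quotient.

```python
import math

def R(t):
    return (math.cos(t), t * t)        # illustrative vector function

def R_dot(t):
    return (-math.sin(t), 2 * t)       # its componentwise derivative

def S(t):
    return (t, math.exp(t))            # second illustrative vector function

def S_dot(t):
    return (1.0, math.exp(t))

def dot(r, s):
    return r[0] * s[0] + r[1] * s[1]

def cross(r, s):
    return r[0] * s[1] - s[0] * r[1]

def central_difference(f, t, h=1e-5):
    """Numerical derivative of a scalar function f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

t0 = 0.8
lhs_dot = central_difference(lambda t: dot(R(t), S(t)), t0)
rhs_dot = dot(R_dot(t0), S(t0)) + dot(R(t0), S_dot(t0))
lhs_cross = central_difference(lambda t: cross(R(t), S(t)), t0)
rhs_cross = cross(R_dot(t0), S(t0)) + cross(R(t0), S_dot(t0))
print(lhs_dot, rhs_dot)
print(lhs_cross, rhs_cross)
```

Each pair of printed values should agree up to the error of the difference quotient.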
d. Application to Plane Curves. Direction, Speed, and Acceleration 
Velocity Vector 

Figure 4.41 Derivative of the position vector for a curve. 

In Section 4.1 we represented a curve C by two functions $x = \varphi(t)$ and 
$y = \psi(t)$. Each t in the domain of these functions determines a point 
$P = (x, y)$ on C; here t may be considered as time and P as a moving 
point whose position at the time t is given by $x(t)$ and $y(t)$. If we identify 
x and y with the components of the position vector $\mathbf{R} = \overrightarrow{OP}$ of P, 
then C is described by the end point of the position vector 

\[
\mathbf{R} = \mathbf{R}(t) = (x(t), y(t))
\]

(Fig. 4.41). For two points P and P' of C corresponding to the parameter 
values t and $t + \Delta t$ we have in 

\[
\overrightarrow{PP'} = \overrightarrow{OP'} - \overrightarrow{OP} = \mathbf{R}(t + \Delta t) - \mathbf{R}(t) = \Delta\mathbf{R}
\]

the vector represented by the directed secant of C with end points P, 
P'. If here $\Delta t$ is positive, that is, if the point P' follows P on C in the 
direction of increasing t, then the vector 

\[
\frac{1}{\Delta t}\,\bigl(\mathbf{R}(t + \Delta t) - \mathbf{R}(t)\bigr)
\]

has the same direction as the vector $\mathbf{R}(t + \Delta t) - \mathbf{R}(t) = \overrightarrow{PP'}$; its 
length is the distance of the points P and P' divided by $\Delta t$. For $\Delta t$ 
tending to zero we obtain in the limit the vector 

\[
\dot{\mathbf{R}} = \dot{\mathbf{R}}(t) = (\dot{x}(t), \dot{y}(t)),
\]
where again the dot is used to denote differentiation with respect to 
the parameter t. The direction of $\dot{\mathbf{R}}$ is the limit of the direction of the 
secants $PP'$ and hence is the direction of the tangent at the point P. 
More precisely, $\dot{\mathbf{R}}$ points in that direction on the tangent that corresponds 
to increasing t on C, provided $\dot{\mathbf{R}} \ne \mathbf{O}$. The direction cosines of 
$\dot{\mathbf{R}}$ are the quantities 

\[
\cos\alpha = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad
\sin\alpha = \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}},
\]

introduced on p. 346 as direction cosines of the tangent. The length of $\dot{\mathbf{R}}$, 

\[
|\dot{\mathbf{R}}| = \sqrt{\dot{x}^2 + \dot{y}^2},
\]

can be interpreted as $ds/dt$, the rate of change of the length s along the 
curve with respect to the parameter t. If t stands for the time, we have 
in $|\dot{\mathbf{R}}|$ the speed with which the point travels along the curve. 

In mechanics one must consider the velocity of a particle not only as 
having a certain magnitude (the “speed”) but also a certain direction. 
Velocity is then represented by the vector $\dot{\mathbf{R}} = (\dot{x}, \dot{y})$, whose length is 
the speed and whose direction is the instantaneous direction of motion, 
that is, the direction of the tangent in the sense of increasing t. 


Acceleration 


Similarly the acceleration of the particle is defined as the vector 
$\ddot{\mathbf{R}} = (\ddot{x}, \ddot{y})$. Vanishing acceleration means that $\ddot{x} = \ddot{y} = 0$; if $\ddot{\mathbf{R}} = \mathbf{O}$ 
along a whole t-interval, the velocity components have constant values 
$\dot{x} = a$, $\dot{y} = b$; the components of the position vector itself are then 
linear functions of t: $x = at + c$, $y = bt + d$. The particle in this case 
moves with constant speed along a straight line. 

All our previous results pertaining to curves are easily expressible 
in vector notation if the curve is described by the position vector 
$\mathbf{R} = \mathbf{R}(t) = (x(t), y(t))$, with $\alpha \le t \le \beta$. We find for the length [cf. 
Eq. (8), p. 349] 

\[
\int_\alpha^\beta |\dot{\mathbf{R}}|\, dt,
\]

while for the signed area enclosed by a curve [cf. Eq. (20), p. 365] 

\[
A = \frac{1}{2} \int_\alpha^\beta \mathbf{R} \times \dot{\mathbf{R}}\, dt
\]

(the sign of this quantity depending again on the orientation of the 
curve). Finally, we have for the curvature $\kappa$ the formula [cf. Eq. (15), 
p. 355] 

\[
\kappa = \frac{\dot{\mathbf{R}} \times \ddot{\mathbf{R}}}{|\dot{\mathbf{R}}|^3}.
\]
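Tangential and Normal Components of Acceleration 


These formulas have interesting implications if we interpret t again as 
the time. Let $\gamma$ be the angle formed by the vector $\ddot{\mathbf{R}}$ with the vector $\dot{\mathbf{R}}$, 
that is, with the instantaneous direction of motion. The quantity 
$|\ddot{\mathbf{R}}| \cos\gamma$ represents the projection of $\ddot{\mathbf{R}}$ onto the direction of $\dot{\mathbf{R}}$; we 
call it the tangential component of acceleration. Similarly, $|\ddot{\mathbf{R}}| \sin\gamma$ is 
the projection of $\ddot{\mathbf{R}}$ onto the normal (more precisely onto that normal 
obtained by a 90° counterclockwise rotation from $\dot{\mathbf{R}}$); this is the normal 
component of acceleration (see Fig. 4.42). By definition of inner and 
outer products 

\[
|\ddot{\mathbf{R}}| \cos\gamma = \frac{\dot{\mathbf{R}} \cdot \ddot{\mathbf{R}}}{|\dot{\mathbf{R}}|}, \qquad
|\ddot{\mathbf{R}}| \sin\gamma = \frac{\dot{\mathbf{R}} \times \ddot{\mathbf{R}}}{|\dot{\mathbf{R}}|}.
\]

Figure 4.42 Tangential and normal acceleration. 

Now 

\[
\dot{\mathbf{R}} \cdot \ddot{\mathbf{R}} = \frac{1}{2}\left(\dot{\mathbf{R}} \cdot \ddot{\mathbf{R}} + \ddot{\mathbf{R}} \cdot \dot{\mathbf{R}}\right)
 = \frac{1}{2}\frac{d}{dt}\left(\dot{\mathbf{R}} \cdot \dot{\mathbf{R}}\right)
 = \frac{1}{2}\frac{d(v^2)}{dt} = v\,\frac{dv}{dt},
\]

where $v = ds/dt = |\dot{\mathbf{R}}| = \sqrt{\dot{\mathbf{R}} \cdot \dot{\mathbf{R}}}$ is the speed of the point. Hence 

\[
|\ddot{\mathbf{R}}| \cos\gamma = \frac{dv}{dt} = \dot{v}. \tag{41}
\]

Thus the tangential component of acceleration is identical with the rate 
of change of speed with respect to time. For the normal acceleration 
the formula for the curvature yields 

\[
|\ddot{\mathbf{R}}| \sin\gamma = \kappa\,|\dot{\mathbf{R}}|^2 = \kappa v^2, \tag{42}
\]

that is, the product of the square of the speed with the curvature. 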
For a particle moving with constant speed v along a curve the tangential 
acceleration $\dot{v}$ vanishes. The acceleration vector then is perpendicular 
to the curve. More precisely it points toward the “inner” 
side of the curve, the side toward which the curve turns (this is seen, 
for example, from the fact that $\sin\gamma > 0$ when $\kappa > 0$, that is, when the 
tangent turns counterclockwise). In moving along a curve at constant 
speed therefore, a point experiences an acceleration toward the inside 
of the curve which is proportional to the curvature and also to the 
square of the speed. This fact is of obvious significance because as a 
result of Newton’s law (to be discussed later) a force proportional to the 
acceleration is needed to hold the point P on the curve. 
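Uniform circular motion gives a simple test case for these statements. The sketch below assumes the motion $\mathbf{R}(t) = (\rho\cos\omega t, \rho\sin\omega t)$ with radius $\rho$ and angular velocity $\omega$ (an example chosen here, not taken from the text); the function `acceleration_components` is a hypothetical helper that evaluates the tangential component $\dot{\mathbf{R}}\cdot\ddot{\mathbf{R}}/|\dot{\mathbf{R}}|$, the normal component $\dot{\mathbf{R}}\times\ddot{\mathbf{R}}/|\dot{\mathbf{R}}|$, and the curvature.

```python
import math

def dot(r, s):
    return r[0] * s[0] + r[1] * s[1]

def cross(r, s):
    return r[0] * s[1] - s[0] * r[1]

def acceleration_components(vel, acc):
    """Return (tangential, normal, curvature) from velocity and acceleration vectors."""
    v = math.hypot(vel[0], vel[1])
    if v == 0.0:
        raise ValueError("direction of motion is undefined when the velocity vanishes")
    tangential = dot(vel, acc) / v          # equals dv/dt
    normal = cross(vel, acc) / v            # equals curvature * v^2
    curvature = cross(vel, acc) / v ** 3
    return tangential, normal, curvature

# Uniform circular motion with radius rho and angular velocity omega.
rho, omega, t = 2.0, 3.0, 0.4
vel = (-rho * omega * math.sin(omega * t), rho * omega * math.cos(omega * t))
acc = (-rho * omega ** 2 * math.cos(omega * t), -rho * omega ** 2 * math.sin(omega * t))
print(acceleration_components(vel, acc))
```

For this motion the tangential component vanishes, the curvature is $1/\rho$, and the normal component is $\rho\omega^2 = \kappa v^2$ with $v = \rho\omega$, directed toward the center of the circle.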


4.4 Motion of a Particle under Given Forces 


The early development of calculus was decisively stimulated not only 
by geometry but just as much by the concepts of mechanics. Mechanics 
rests on certain basic principles first laid down by Newton; the state- 
ment of these principles involves the concept of the derivative, and their 
application requires the theory of integration. Without analyzing 
Newton’s principles in detail, we shall illustrate by some simple 
examples how calculus is applied in mechanics. 


a. Newton's Law of Motion 


We shall restrict ourselves to the consideration of a single particle, 
that is, of a point at which a mass m is imagined to be concentrated. 
We shall further assume that the motion takes place in the x,y-plane, 
in which the position of the particle at the time t is specified by its 
coordinates $x = x(t)$, $y = y(t)$, or, equivalently, by its “position vector” 
$\mathbf{R} = \mathbf{R}(t) = (x(t), y(t))$. A dot above a quantity indicates differentiation 
with respect to the time t. The velocity and acceleration of the particle 
are then represented by the vectors 

\[
\dot{\mathbf{R}} = (\dot{x}, \dot{y}) \qquad\text{and}\qquad \ddot{\mathbf{R}} = (\ddot{x}, \ddot{y}).
\]

In mechanics one relates the motion of a point to the concept of 
forces of definite direction and magnitude acting on the point. A force 
is then also described by a vector $\mathbf{F} = (\rho, \sigma)$. The effect of several 
forces $\mathbf{F}_1, \mathbf{F}_2, \ldots$ acting on the same particle is the same as that of a 
single force $\mathbf{F}$, the resultant force, which is simply the vector sum 
$\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots$ of the individual forces. 

Newton’s fundamental law states: The mass m multiplied by the 
acceleration is equal to the force acting on the particle, in symbols 

\[
m\ddot{\mathbf{R}} = \mathbf{F}. \tag{43}
\]
If we write this vector equation which expresses the fundamental law 
in terms of the components of those vectors, we obtain the equivalent 
pair of equations 

\[
m\ddot{x} = \rho, \qquad m\ddot{y} = \sigma. \tag{44}
\]

Since acceleration and force differ only by the positive factor m, 
the direction of the acceleration is the same as that of the force. If no 
force acts, that is, F = O, the acceleration vanishes, the velocity is 
constant, and x and y become linear functions of t. This is Newton’s 
first law: A particle on which no force acts moves with constant 
velocity along a straight line. 

Newton's law $m\ddot{\mathbf{R}} = \mathbf{F}$ is in the first instance nothing more than a 
quantitative definition of the concept of force. The left-hand side of 
this relation can be determined by observation of the motion, by means 
of which we then obtain the force. 

However, Newton's law has a far deeper meaning, due to the fact 
that in many cases we can determine the acting force from other physical 
considerations, without any knowledge of the corresponding motion. 
This fundamental law is then no longer a definition of force, but 
instead is a relation from which we can hope to determine the motion. 
This decisive turn in using Newton’s law comes into play in all the 
numerous instances where physical considerations permit us to express 
the force $\mathbf{F}$ or its components $\rho$, $\sigma$ in an explicit way as functions of the 
position and velocity of the particle and of the time t. The law of 
motion then is not a tautology, but furnishes two equations expressing 
$m\ddot{x}$, $m\ddot{y}$ in terms of $x, y, \dot{x}, \dot{y}$, and t, the so-called equations of motion. 
These equations are differential equations, that is, relations between 
functions and their derivatives. Solving these differential equations, 
that is, finding all pairs of functions $x(t)$, $y(t)$ for which the equations of 
motion are valid, yields all possible motions of a particle under the 
prescribed force. 


b. Motion of Falling Bodies 


The simplest example of a known force is that of gravity acting on a 
particle near the surface of the earth. It is known from direct observation 
that (aside from effects of air resistance) every falling body has an 
acceleration which is directed vertically downward, and which has the 
same magnitude g for all bodies. Measured in feet per second per 
second, g has the approximate value 32.16.¹ If we choose an 


1 The precise value of g, which includes, in addition to gravitational attraction, 
effects of the rotation of the earth, depends on the location on the earth. 
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x,y-coordinate system in which the y-axis points vertically upward while 
the x-axis is horizontal, the acceleration Ë = (#, ý) has the components 


= 0, j = —g. 


By Newton’s fundamental law the vector F representing the force of 
gravity acting on a particle of mass m must then be 


F = (0, —mg). 


This force vector is likewise directed vertically downward; its magni- 
tude, the weight of the body near the surface of the earth, is mg. 

When we cancel out the factor m, the equations of motion of a
particle under gravity take the form

$$\ddot{x} = 0, \qquad \ddot{y} = -g.$$

From these equations we can easily obtain a description of the most 
general motion possible for a falling body. Integrating with respect 
to t yields 

$$\dot{x} = a, \qquad \dot{y} = -gt + b,$$

where a and b are constants. A further integration then shows that

$$x = at + c, \qquad y = -\tfrac{1}{2}gt^2 + bt + d,$$

where c and d are constants. Thus the general solution of our equations
of motion depends on four unspecified constants a, b, c, d. We can
immediately relate the values of these constants for an individual
motion to the initial conditions for that motion. If the particle at the
initial time t = 0 is at the point $(x_0, y_0)$, then setting t = 0 we find

$$c = x_0, \qquad d = y_0.$$

The velocity $\dot{\mathbf{R}} = (\dot{x}, \dot{y}) = (a, -gt + b)$ reduces for t = 0 to (a, b).
Thus (c, d) and (a, b) represent respectively the initial position and initial
velocity of the particle. Any choice of these initial conditions leads
uniquely to a motion.

In case a ≠ 0, that is, in case the initial velocity is not vertical, we can
eliminate t and obtain a nonparametric representation for the orbit
of the particle. Solving the first equation for t and substituting into
the second yields

$$y = -\frac{g}{2a^2}(x - c)^2 + \frac{b}{a}(x - c) + d.$$

Hence the path is a parabola. For a = 0 we have x = c = constant,
and the whole motion takes place along a vertical straight line.
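
The elimination of t can be checked numerically. The following Python sketch is an illustration only: the function names and the sample initial data are assumptions made here, not part of the text. It evaluates the parametric solution x = at + c, y = −½gt² + bt + d with c = x0, d = y0, and compares it with the parabola obtained by eliminating t.

```python
def position(t, x0, y0, a, b, g=32.16):
    """Parametric solution: initial position (x0, y0), initial velocity (a, b).

    g is in feet per second per second, as in the text.
    """
    x = x0 + a * t
    y = y0 + b * t - 0.5 * g * t * t
    return x, y


def orbit_height(x, x0, y0, a, b, g=32.16):
    """Height on the parabolic orbit above the abscissa x (requires a != 0)."""
    if a == 0:
        raise ValueError("for a = 0 the motion is vertical and y is not a function of x")
    u = (x - x0) / a  # this is the time t recovered from x = a t + x0
    return y0 + b * u - 0.5 * g * u * u


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 1.0):
        x, y = position(t, 1.0, 2.0, 3.0, 4.0)
        print(t, x, y, orbit_height(x, 1.0, 2.0, 3.0, 4.0))
```

For every sampled time the two heights should agree up to rounding, which is just the statement that the point (x(t), y(t)) lies on the parabola.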
c. Motion of a Particle Constrained to a Given Curve


In most problems of mechanics the forces acting on a particle depend
on the position and velocity of the particle. As a rule, the equations
of motion are too complicated to permit us to determine all possible
motions. Considerable simplification arises if we may consider the curve
C described by the particle as known and only have to determine the
motion of the particle along the curve. In a large class of mechanical
problems the particle is constrained (by means of some mechanical
device) to move on a given curve C. The simplest example is the plane
pendulum where a mass m is joined by an inextensible string of length L
to a point P, and moves under the influence of gravity on a circle of
radius L about P.

Along the curve C we use the arc length s as parameter. The curve is 
then given by x = x(s), y = y(s). Finding the motion of the particle 
along C then amounts to finding s as a function of t. An equation of 
motion along the curve is obtained as follows. 

We form the inner product of both sides of Newton’s formula
$m\ddot{\mathbf{R}} = \mathbf{F}$ with a vector $\boldsymbol{\xi}$:

$$m\ddot{\mathbf{R}} \cdot \boldsymbol{\xi} = \mathbf{F} \cdot \boldsymbol{\xi}.$$

If we take for $\boldsymbol{\xi}$ the vector of length 1 whose direction is that of the
tangent to C in the sense of increasing s, that is, $\boldsymbol{\xi} = d\mathbf{R}/ds$, we have
in $\mathbf{F} \cdot \boldsymbol{\xi} = f$ the tangential component of the force, or the force acting
in the direction of the motion. According to Equation (41), p. 396, the
tangential component $\ddot{\mathbf{R}} \cdot \boldsymbol{\xi}$ of the acceleration is just $dv/dt = d^2s/dt^2$,
that is, the acceleration of the particle along the curve. Newton’s law
then yields the formula

$$m\ddot{s} = f, \tag{45}$$

that is, the mass of the particle multiplied by the acceleration of the
particle along its path equals the force acting on the particle in the
direction of motion.

In applying this equation to a particle constrained to move along C
we assume that the constraints make no contribution to f.¹ For a
force F = (ρ, σ) we have then by Equation (44), p. 398,

$$f = \rho\,\frac{dx}{ds} + \sigma\,\frac{dy}{ds}, \tag{46}$$

since the vector $\boldsymbol{\xi}$ has the components dx/ds, dy/ds (see p. 394). For a known
curve C the direction cosines dx/ds and dy/ds of the tangent can be considered
as known functions of s. If likewise the force F = (ρ, σ) depends only
on the position of the particle, we have in f a known function of s.
The motion of the particle along C then has to be determined from
the relatively simple differential equation $m\ddot{s} = f(s)$.

1 Actually, the mechanism of constraint has to supply a force that holds the particle
on C (in the simple pendulum this is provided by the tension of the string). We
assume that this “reaction” force is perpendicular to the curve and thus has no
tangential component; this would be the case for frictionless sliding of the particle
along a curve.


Figure 4.43 Motion on a given curve under gravity. 


Specifically, for the gravitational force F = (0, −mg) we have

$$f = -mg\,\frac{dy}{ds}; \tag{46a}$$

thus the equation of motion of a particle constrained to move on a
curve C under the influence of gravity becomes

$$\frac{d^2s}{dt^2} = -g\,\frac{dy}{ds}.$$

If α denotes the inclination angle of the curve, we have dy/ds = sin α
(see Fig. 4.43), and the equation of motion becomes

$$\frac{d^2s}{dt^2} = -g\sin\alpha.$$

For a particle constrained to move on a circle of radius L about the
origin (“simple pendulum”) we have

$$x = L\sin\theta, \qquad y = -L\cos\theta,$$

where θ = s/L is the polar angle counted from the downward direction.
Here (see Fig. 4.44) α = θ and thus

$$\frac{d^2s}{dt^2} = -g\,\frac{dy}{d\theta}\,\frac{d\theta}{ds} = -g\sin\theta$$

or

$$\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta.$$

Figure 4.44 The simple pendulum.



4.5 Free Fall of a Body Resisted by Air 


We start with two examples of the motion of a particle along a 
straight line. We consider only cases where the force acts in the 
direction of the line so that no mechanism of constraint is necessary. 

The path of a body falling freely downward can be described parametrically
by x = constant, y = s. If gravity is the only force acting, we
have the equation of motion

$$m\ddot{s} = -mg.$$

For a particle released at the time t = 0 from the altitude $y_0 = s_0$
with initial velocity $v_0$ (counted positive if upward), we find then by
integration

$$s = -\tfrac{1}{2}gt^2 + v_0 t + s_0.$$


If we wish to take account of the effect of the friction or air resistance
acting on the particle, we have to consider this as a force whose direction
is opposite to the direction of motion and concerning which we must
make definite physical assumptions.¹ We shall work out the results of
different physical assumptions: (a) The resistance is proportional to
the velocity, and is given by an expression of the form $-r\dot{s}$, where r is a
positive constant; (b) the resistance is proportional to the square of
the velocity, and is of the form $-r\dot{s}^2$ for positive $\dot{s}$ and $r\dot{s}^2$ for negative $\dot{s}$.
In accordance with Newton’s law we obtain the equations of motion

$$\text{(a)} \quad m\ddot{s} = -mg - r\dot{s},$$
$$\text{(b)} \quad m\ddot{s} = -mg + r\dot{s}^2,$$

where we have assumed in (b) that the body is falling ($\dot{s} < 0$). If we
first consider $\dot{s} = v(t)$ as the function sought, we have

$$\text{(a)} \quad m\dot{v} = -mg - rv,$$
$$\text{(b)} \quad m\dot{v} = -mg + rv^2.$$

1 These assumptions must be chosen to suit the particular physical system under
consideration; for example, the law of resistance for low speeds is not the same as
that for high ones (such as bullet velocities).


Instead of determining v as a function of t by these equations, we
determine t as a function of v, writing our differential equations in the
form

$$\text{(a)} \quad \frac{dt}{dv} = -\frac{1}{g(1 + k^2 v)},$$
$$\text{(b)} \quad \frac{dt}{dv} = -\frac{1}{g(1 - k^2 v^2)},$$

where we have put $\sqrt{r/(mg)} = k$. With the help of the methods given in
Chapter 3 we can immediately carry out the integrations and obtain

$$\text{(a)} \quad t = -\frac{1}{gk^2}\log(1 + k^2 v) + t_0,$$
$$\text{(b)} \quad t = \frac{1}{2gk}\log\frac{1 - kv}{1 + kv} + t_0.$$


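Solving these equations for v, we have

$$\text{(a)} \quad v = -\frac{1}{k^2}\left(1 - e^{-gk^2(t - t_0)}\right),$$
$$\text{(b)} \quad v = -\frac{1}{k}\,\frac{1 - e^{-2gk(t - t_0)}}{1 + e^{-2gk(t - t_0)}} = -\frac{1}{k}\tanh[gk(t - t_0)].$$
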

These equations at once reveal an important property of the motion.
The velocity does not increase with time beyond all bounds, but tends
to a definite limit depending on the mass m and the constant r (which,
in turn, depends on the shape of the falling body and the air density):

$$\text{(a)} \quad \lim_{t \to \infty} v(t) = -\frac{1}{k^2} = -\frac{mg}{r},$$
$$\text{(b)} \quad \lim_{t \to \infty} v(t) = -\frac{1}{k} = -\sqrt{\frac{mg}{r}}.$$


For the limiting velocities frictional resistance just balances gravitational
attraction. A second integration performed on our expressions
for $v(t) = \dot{s}$, with the help of the methods of Chapter 3, gives the results
(which may be verified by differentiation)

$$\text{(a)} \quad s(t) = -\frac{1}{k^2}(t - t_0) - \frac{1}{gk^4}\left(e^{-gk^2(t - t_0)} - 1\right) + c,$$
$$\text{(b)} \quad s(t) = -\frac{1}{gk^2}\log[\cosh gk(t - t_0)] + c,$$

where c is a constant of integration. Here $t_0$ is the time at which the
particle would have had velocity 0 and c its altitude at the time $t_0$.
The two constants c and $t_0$ can also be related easily to the velocity and
position at any other time $t_1$, if we consider those quantities as initial
conditions.
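
The limiting velocities can be made concrete with a short computation. The Python sketch below is illustrative: the function names, the SI value of g, and the sample values of m and r are assumptions introduced here. It evaluates the two velocity formulas, with k² = r/(mg), and lets one compare them for large t with the limits −mg/r and −√(mg/r).

```python
import math


def velocity_linear(t, m, r, g=9.81, t0=0.0):
    """Case (a): resistance -r*v; v(t) = -(1/k^2) (1 - exp(-g k^2 (t - t0)))."""
    k2 = r / (m * g)
    return -(1.0 / k2) * (1.0 - math.exp(-g * k2 * (t - t0)))


def velocity_quadratic(t, m, r, g=9.81, t0=0.0):
    """Case (b), falling body: v(t) = -(1/k) tanh(g k (t - t0))."""
    k = math.sqrt(r / (m * g))
    return -(1.0 / k) * math.tanh(g * k * (t - t0))


if __name__ == "__main__":
    m, r = 2.0, 0.5  # hypothetical sample values
    for t in (0.0, 1.0, 5.0, 100.0):
        print(t, velocity_linear(t, m, r), velocity_quadratic(t, m, r))
    print("limits:", -m * 9.81 / r, -math.sqrt(m * 9.81 / r))
```

Both velocities start at 0 for t = t0 and decrease monotonically toward their respective limits, in agreement with the discussion above.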


4.6 The Simplest Type of Elastic Vibration—Motion of a Spring 


As a second example—of major significance—we consider the 
motion of a particle which moves along the x-axis and is pulled back 
toward the origin by an elastic force. As regards the elastic force we 
assume that it is always directed toward the origin and that its magni- 
tude is proportional to the distance from the origin. In other words, we 
take the force as equal to — kx, where the coefficient k is a measure of the 
stiffness of the elastic connection. Since k is assumed positive, the 
force is negative when x is positive and positive when x is negative. 
Newton’s law now tells us that 


$$m\ddot{x} = -kx. \tag{48}$$


This differential equation by itself does not determine the motion
completely, but for a given instant of time, say t = 0, we can arbitrarily
assign the initial position $x(0) = x_0$ and the initial velocity $\dot{x}(0) = v_0$;
that is, in physical language, we can start off the particle from
an arbitrary position with an arbitrary velocity; thereafter the motion
is determined by the differential equation. Mathematically, this is
expressed by the fact that the general solution of our differential
equation contains two constants of integration, at first undetermined,
whose values we find by means of the initial conditions. This fact we
shall prove immediately.

We can easily state such a solution directly. If we put $\omega = \sqrt{k/m}$,
our differential equation becomes $d^2x/dt^2 = -\omega^2 x$. The substitution
$\tau = \omega t$ for the independent variable reduces this equation to the form
$d^2x/d\tau^2 = -x$, discussed in Chapter 3, p. 312. Thus our differential
equation is satisfied by all the functions

$$x(t) = c_1\cos\omega t + c_2\sin\omega t,$$

which may also be verified at once by differentiation (where $c_1$ and $c_2$
denote constants chosen arbitrarily). In Chapter 3, p. 313, we saw
that there are no other solutions of our differential equation and hence
that every such motion under the influence of an elastic force is given
by this expression. This can easily be put in the form

$$x(t) = a\sin\omega(t - \delta) = -a\sin\omega\delta\,\cos\omega t + a\cos\omega\delta\,\sin\omega t;$$

we need only write $-a\sin\omega\delta = c_1$ and $a\cos\omega\delta = c_2$, thus introducing
instead of $c_1$ and $c_2$ the new constants a and δ. Motions of this
type are said to be sinusoidal or simple harmonic. They are periodic;
any state [that is, position x(t) and velocity $\dot{x}(t)$] is repeated after the
time $T = 2\pi/\omega$, which is called the period, since the functions sin ωt
and cos ωt have the period T. The number a is called the maximum
displacement or amplitude of the oscillation. The number $1/T = \omega/2\pi$
is called the frequency of the oscillation; it measures the number of
oscillations per unit time. We shall return to the theory of oscillations
in Chapter 8.
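
The passage from initial data to the constants of the solution can be sketched in a few lines of Python. The function name `harmonic_motion` and its return values are assumptions made for illustration. Setting t = 0 in x(t) and in its derivative gives c1 = x0 and c2 = v0/ω; the amplitude a and the phase δ then follow from −a sin ωδ = c1, a cos ωδ = c2.

```python
import math


def harmonic_motion(x0, v0, k, m):
    """Return (omega, a, delta, period, x) for m x'' = -k x, x(0)=x0, x'(0)=v0."""
    omega = math.sqrt(k / m)
    c1 = x0              # from x(0) = c1
    c2 = v0 / omega      # from x'(0) = omega * c2
    a = math.hypot(c1, c2)
    # -a sin(omega*delta) = c1 and a cos(omega*delta) = c2
    delta = math.atan2(-c1, c2) / omega
    period = 2.0 * math.pi / omega

    def x(t):
        return c1 * math.cos(omega * t) + c2 * math.sin(omega * t)

    return omega, a, delta, period, x


if __name__ == "__main__":
    omega, a, delta, period, x = harmonic_motion(1.0, 0.0, 4.0, 1.0)
    print(omega, a, delta, period)
    print(x(0.3), a * math.sin(omega * (0.3 - delta)))
```

The two printed values in the last line should coincide, since they are the same motion written in the two forms given in the text.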


*4.7 Motion on a Given Curve 


a. The Differential Equation and Its Solution 


We now turn to the general form of the problem of motion along a
given curve under an arbitrary preassigned force m f(s). We shall determine
the function s(t) by means of the differential
equation [Eq. (45), p. 400]

$$\ddot{s} = f(s),$$

where f(s) is a given function.¹ This differential equation in s can be
solved completely by the following device.

1 Our original equation of motion along a curve was $m\ddot{s} = f(s)$; we can, however,
always write the function f(s) in the form m f(s), obtaining the simpler form of the
equation used here.

We consider any primitive function F(s) of f(s), so that F′(s) = f(s),
and multiply both sides of the equation $\ddot{s} = f(s) = F'(s)$ by $\dot{s}$. We can
then write the left-hand side in the form $d(\dot{s}^2/2)/dt$, as we see at once by
differentiating the expression $\dot{s}^2$; the right-hand side $F'(s)\dot{s}$, however,
by the chain rule of differentiation is the derivative of F(s) with respect
to the time t, if in F(s) we regard the quantity s as a function of t.
Hence we immediately have

$$\frac{d}{dt}\left(\frac{\dot{s}^2}{2}\right) = \frac{d}{dt}F(s),$$

or by integration

$$\tfrac{1}{2}\dot{s}^2 = F(s) + c,$$

where c denotes a constant yet to be determined.

We have now arrived at an equation which only involves the function
s(t) and its first derivative. (Later on we shall interpret this equation as
expressing the conservation of energy during the motion.) Let us write
this equation in the form $ds/dt = \sqrt{2[F(s) + c]}$. We see that from
this we cannot immediately find s as a function of t by integration.
However, we arrive at a solution of the problem if we at first content
ourselves with finding the inverse function t(s), that is, the time taken
by the particle to reach a definite position s. For t(s) we have the
equation

$$\frac{dt}{ds} = \frac{1}{\sqrt{2[F(s) + c]}};$$

thus the derivative of the function t(s) is known, and we have

$$t = c_1 + \int \frac{ds}{\sqrt{2[F(s) + c]}},$$

where $c_1$ is another constant of integration. As soon as we have
performed this last integration we have solved the problem, for although
we have not determined the position s as a function of t, we have
inversely found the time t as a function of the position s. The fact
that the two constants of integration c and $c_1$ are still available enables
us to make the general solution fit special initial conditions.

The general discussion can be illustrated by our earlier example of
elastic vibrations if we identify x with s; here $f(s) = -\omega^2 s$ and correspondingly,
say, $F(s) = -\tfrac{1}{2}\omega^2 s^2$. We therefore obtain

$$\frac{dt}{ds} = \frac{1}{\sqrt{2c - \omega^2 s^2}}$$

and furthermore,

$$t = c_1 + \int \frac{ds}{\sqrt{2c - \omega^2 s^2}}.$$

This integral, however, can easily be evaluated by introducing $\omega s/\sqrt{2c}$
as a new variable: we thus obtain

$$t = \frac{1}{\omega}\arcsin\frac{\omega s}{\sqrt{2c}} + c_1,$$

or, forming the inverse function,

$$s = \frac{\sqrt{2c}}{\omega}\sin\omega(t - c_1).$$

We are thus led to exactly the same formula for the solution as before.
From this example we also see what the constants of integration
mean and how they are to be determined. If, for example, we require
that at the time t = 0 the particle shall be at the point s = 0 and at that
instant shall have the velocity $\dot{s}(0) = 1$, we obtain the two equations

$$0 = -\frac{\sqrt{2c}}{\omega}\sin\omega c_1, \qquad 1 = \sqrt{2c}\cos\omega c_1,$$

from which we find that the constants have the values $c_1 = 0$, $c = \tfrac{1}{2}$.
The constants of integration c and $c_1$ can be determined in exactly the
same way when the initial position $s_0$ and the initial velocity $\dot{s}_0$ (at
time t = 0) are prescribed arbitrarily.
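
When the integral for t(s) cannot be evaluated in closed form, it can still be approximated numerically. The following Python sketch is one possible illustration: the function name `time_to_reach`, the use of Simpson's rule, and the sample values ω = 2, c = ½, c1 = 0 are assumptions introduced here. It computes t(s) for the elastic force and compares the result with (1/ω) arc sin(ωs/√(2c)).

```python
import math


def time_to_reach(s, F, c, n=2000):
    """Approximate t(s) - c1 = integral from 0 to s of d(sigma)/sqrt(2 (F(sigma) + c)).

    Composite Simpson rule; valid as long as F(sigma) + c > 0 on [0, s],
    that is, before the particle reaches a turning point.
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = s / n

    def integrand(sigma):
        return 1.0 / math.sqrt(2.0 * (F(sigma) + c))

    total = integrand(0.0) + integrand(s)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    omega, c = 2.0, 0.5  # corresponds to s(0) = 0 and initial velocity 1
    F = lambda s: -0.5 * omega**2 * s**2
    s = 0.3
    print(time_to_reach(s, F, c))
    print(math.asin(omega * s / math.sqrt(2.0 * c)) / omega)
```

The sketch deliberately stops short of the turning point s = √(2c)/ω, where the integrand becomes infinite and a plain quadrature rule would need special treatment.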


b. Particle Sliding down a Curve 


The case of a particle sliding down a frictionless curve under the
influence of gravity can be treated very simply by the method just
described. We found already on p. 401 the equation of motion corresponding
to this case:

$$\ddot{s} = -g\,\frac{dy}{ds},$$

where dots indicate differentiation with respect to the time t. The
right-hand side of this equation is a known function of s, since we know
the curve and we can therefore regard the quantities x and y as known
functions of s.

As in the last section, we multiply both sides of this equation by $\dot{s}$.
The left-hand side then becomes the derivative of $\tfrac{1}{2}\dot{s}^2$ with respect to t.
If in the function y(s) we regard s as a function of t, the right-hand side
of our equation is the derivative of −gy with respect to t. On integrating,
we therefore have

$$\tfrac{1}{2}\dot{s}^2 = -gy + c,$$

where c is a constant of integration. To find the interpretation of this
constant, we suppose that at the time t = 0 our particle is at the point
of the curve for which the coordinates are $x_0$ and $y_0$ and that at this
instant its velocity is zero, that is, $\dot{s}(0) = 0$. Then putting t = 0 we
immediately have $-gy_0 + c = 0$, so that

$$\tfrac{1}{2}\dot{s}^2 = g(y_0 - y).$$


Since $\dot{s}^2$ can never be negative, we see that the altitude y of the particle
never exceeds the value $y_0$, and only reaches it at those instants when the
velocity of the particle is zero. The lower the particle is, the larger is
its velocity. Now instead of regarding s as a function of t we shall consider
the inverse function t(s). For this we at once obtain

$$\frac{dt}{ds} = \pm\frac{1}{\sqrt{2g(y_0 - y)}},$$

which is equivalent to

$$t = c_1 \pm \int \frac{ds}{\sqrt{2g(y_0 - y)}},$$

where $c_1$ is a new constant of integration. As regards the sign of the
square root, which is the same as the sign of $\dot{s}$, we notice that if the
particle moves along an arc which is lower than $y_0$ everywhere except
at the ends, the sign cannot change. For the sign of $\dot{s}$ can change only
where $\dot{s} = 0$, that is, where $y - y_0 = 0$. Thus the particle can only
“turn back” at points of maximum elevation $y_0$ on the curve. Instead
of the arc length s the curve can also be referred to any parameter θ,
so that x = φ(θ), y = ψ(θ). Introducing θ as independent variable, we
obtain

$$t = c_1 \pm \int \frac{ds}{d\theta}\,\frac{d\theta}{\sqrt{2g(y_0 - y)}} = c_1 \pm \int \sqrt{\frac{x'^2 + y'^2}{2g(y_0 - y)}}\,d\theta,$$

where the functions x′ = φ′(θ), y′ = ψ′(θ), and y = ψ(θ) are known.
In order to determine the constant of integration $c_1$ we note that for
t = 0 the parameter θ will have a value $\theta_0$. This immediately gives us
our solution in the form

$$t = \pm\int_{\theta_0}^{\theta}\sqrt{\frac{x'^2 + y'^2}{2g(y_0 - y)}}\,d\theta. \tag{49}$$

We see that this equation represents the time taken by the particle to
move from the parameter value $\theta_0$ to the parameter value θ. The
inverse function θ(t) of this function t(θ) enables us to describe the
motion completely; for at each instant t we can determine the point
x = φ[θ(t)], y = ψ[θ(t)] which the particle is then passing.
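
Formula (49) lends itself to numerical evaluation. The Python sketch below is a hedged illustration: the function name `descent_time`, the SI value of g, and the inclined-line example are assumptions made here. Because the integrand becomes infinite at the starting point, where y = y0, the sketch substitutes θ = θ0 + u² before applying the midpoint rule; this substitution is a numerical convenience, not a step taken in the text.

```python
import math


def descent_time(x_prime, y_prime, y, theta0, theta1, g=9.81, n=20000):
    """Approximate formula (49) for a particle released at rest at theta0.

    x_prime, y_prime: derivatives of the curve x = phi(theta), y = psi(theta);
    y: the function psi itself. Requires theta1 > theta0 and y(theta) < y(theta0)
    for theta0 < theta <= theta1.
    """
    y0 = y(theta0)
    u_max = math.sqrt(theta1 - theta0)
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h                  # midpoint, never exactly 0
        th = theta0 + u * u
        ds_dtheta = math.hypot(x_prime(th), y_prime(th))
        # d(theta) = 2 u du absorbs the square-root singularity at theta0
        total += ds_dtheta / math.sqrt(2.0 * g * (y0 - y(th))) * 2.0 * u * h
    return total


if __name__ == "__main__":
    beta = math.radians(30.0)  # straight line inclined at 30 degrees, theta = arc length
    t = descent_time(lambda th: math.cos(beta),
                     lambda th: -math.sin(beta),
                     lambda th: -th * math.sin(beta),
                     0.0, 2.0)
    print(t, math.sqrt(2.0 * 2.0 / (9.81 * math.sin(beta))))
```

For the inclined line the acceleration along the curve is the constant g sin β, so the time to slide a distance s from rest is √(2s/(g sin β)); the two printed numbers should agree.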


c. Discussion of the Motion 


From the equations just found, even without an explicit expression 
for the result of the integration we can deduce the general nature of the 
motion by simple intuitive reasoning. We suppose that our curve is of 


Figure 4.45 


the type shown in Fig. 4.45, that is, that it consists of an arc convex
downward; we take s as increasing from left to right. If we initially
release the particle at the point A with coordinates $x_0 = \phi(\theta_0)$, $y_0 = \psi(\theta_0)$,
corresponding to $\theta = \theta_0$, the velocity increases, for the acceleration
$\ddot{s}$ is positive. The particle travels from A to the lowest point with
ever-increasing velocity. After the lowest point is passed, however, the
acceleration is negative, since the right-hand side −g dy/ds of the
equation of motion is negative. The velocity therefore decreases. From
the equation $\dot{s}^2 = 2g(y_0 - y)$ we see at once that the velocity reaches
the value zero when the particle reaches the point B whose height is the
same as that of the initial position A. Since the acceleration is still
negative, the motion of the particle must be reversed at this point,
so that the particle will swing back to the point A; this action will
repeat itself indefinitely. (The reader will recall that friction has been
disregarded.) In this oscillatory motion the time which the particle takes
to return from B to A must clearly be the same as the time taken to
move from A to B, since at equal heights we have equal values of $|\dot{s}|$. If
we denote the time required for a complete journey from A to B and
back again by T, the motion will obviously be periodic with period T.
If $\theta_0$ and $\theta_1$ are the values of the parameter corresponding to the points
A and B, respectively, the half-period is given by the expression

$$\frac{T}{2} = \frac{1}{\sqrt{2g}}\left|\int_{\theta_0}^{\theta_1}\sqrt{\frac{x'^2 + y'^2}{y_0 - y}}\,d\theta\right| = \frac{1}{\sqrt{2g}}\left|\int_{\theta_0}^{\theta_1}\sqrt{\frac{\phi'^2(\theta) + \psi'^2(\theta)}{\psi(\theta_0) - \psi(\theta)}}\,d\theta\right|. \tag{50}$$

If $\theta_2$ is the value of the parameter corresponding to the lowest point of
the curve, the time which the particle takes to fall from A to this lowest
point is

$$\frac{1}{\sqrt{2g}}\left|\int_{\theta_0}^{\theta_2}\sqrt{\frac{x'^2 + y'^2}{y_0 - y}}\,d\theta\right|.$$


d. The Ordinary Pendulum 


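The simplest example is given by the so-called simple pendulum.
Here the curve under consideration is a circle of fixed radius L:

$$x = L\sin\theta, \qquad y = -L\cos\theta,$$

where the angle θ is measured in the positive sense from the position
of rest. From the general expression (50) we at once obtain, using the
addition theorem for the cosine,

$$T = \sqrt{\frac{2L}{g}}\int_{-\theta_0}^{\theta_0}\frac{d\theta}{\sqrt{\cos\theta - \cos\theta_0}} = \sqrt{\frac{L}{g}}\int_{-\theta_0}^{\theta_0}\frac{d\theta}{\sqrt{\sin^2(\theta_0/2) - \sin^2(\theta/2)}},$$

where $\theta_0$ ($0 < \theta_0 < \pi$) denotes the amplitude of oscillation of the
pendulum, that is, the angular position from which the particle is
released at time t = 0 with velocity zero.¹ By the substitution

$$u = \frac{\sin(\theta/2)}{\sin(\theta_0/2)}, \qquad \frac{du}{d\theta} = \frac{\cos(\theta/2)}{2\sin(\theta_0/2)},$$

our expression for the period of oscillation of the pendulum becomes

$$T = 2\sqrt{\frac{L}{g}}\int_{-1}^{1}\frac{du}{\sqrt{(1 - u^2)\left(1 - u^2\sin^2(\theta_0/2)\right)}}.$$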


1 We have assumed here that the velocity does become equal to zero at some time
during the motion. This excludes the type of tumbling motion of the pendulum in
which θ is not periodic and varies monotonically for all t.

We have therefore expressed the period of oscillation of the pendulum
by an elliptic integral (see p. 299).

If we assume that the amplitude of the oscillation is small, so that we
may with sufficient accuracy replace the second factor under the
square root sign by 1, we obtain the expression

$$2\sqrt{\frac{L}{g}}\int_{-1}^{1}\frac{du}{\sqrt{1 - u^2}}$$

as an approximation for the period of oscillation. We can evaluate
this last integral by formula 13 in our table of integrals (p. 263) and
obtain the expression $2\pi\sqrt{L/g}$ as an approximate value for T. To this
order of approximation the period is independent of $\theta_0$, that is, of the
amplitude of the oscillation of the pendulum. Clearly, the exact
period is larger and increases with $\theta_0$. Since in the interval of integration

$$1 \ge 1 - u^2\sin^2\frac{\theta_0}{2} \ge 1 - \sin^2\frac{\theta_0}{2} = \cos^2\frac{\theta_0}{2},$$

we find for the period the estimates

$$2\pi\sqrt{\frac{L}{g}} \le T \le \frac{1}{\cos(\theta_0/2)}\,2\pi\sqrt{\frac{L}{g}}.$$

For angles $\theta_0 < 10°$ we have $1/\cos(\theta_0/2) < \sec 5° < 1.004$, so that the
period will be given by the formula $2\pi\sqrt{L/g}$ with a relative error of less
than ½%. For finer approximation of the elliptic integral for T see
Section 7.6f.
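
The estimates for the period can be checked numerically. The Python sketch below is an illustration under stated assumptions: the function name `pendulum_period` and the SI value of g are choices made here, and the further substitution u = sin φ, which turns the elliptic integral into 4√(L/g) times the integral of dφ/√(1 − sin²(θ0/2) sin²φ) from 0 to π/2, is introduced only to remove the endpoint singularities before applying the midpoint rule.

```python
import math


def pendulum_period(theta0, L, g=9.81, n=2000):
    """Period of the simple pendulum with amplitude theta0 (radians).

    Uses T = 4 sqrt(L/g) * integral_0^{pi/2} d(phi) / sqrt(1 - k^2 sin^2(phi)),
    k = sin(theta0 / 2), evaluated by the midpoint rule.
    """
    k = math.sin(theta0 / 2.0)
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        total += h / math.sqrt(1.0 - (k * math.sin(phi)) ** 2)
    return 4.0 * math.sqrt(L / g) * total


if __name__ == "__main__":
    L = 1.0
    T0 = 2.0 * math.pi * math.sqrt(L / 9.81)  # small-amplitude approximation
    for degrees in (1, 10, 45, 90):
        theta0 = math.radians(degrees)
        ratio = pendulum_period(theta0, L) / T0
        print(degrees, ratio, 1.0 / math.cos(theta0 / 2.0))
```

For each amplitude the printed ratio T/(2π√(L/g)) should lie between 1 and 1/cos(θ0/2), as the inequality above requires.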


e. The Cycloidal Pendulum 


The fact that the period of oscillation of the ordinary pendulum is not
strictly independent of the amplitude of oscillation caused Christian
Huygens, in his prolonged efforts to construct accurate clocks, to seek a
curve C for which the period of oscillation is independent of the position
on C at which the oscillating particle begins its motion.¹ Huygens
recognized that the cycloid is such a curve.

In order that a particle may actually be able to oscillate on a cycloid
the cusps of the cycloid must point in the direction opposite to that of
the force of gravity; that is, we must rotate the cycloid considered
previously (p. 328) about the x-axis (cf. Fig. 4.2, p. 329). We therefore
write the equations of the cycloid in the form

$$x = a(\theta + \pi + \sin\theta),$$
$$y = -a(1 + \cos\theta),$$

which also involves a change of the parameter t into θ + π (Fig. 4.46).

1 The oscillations are then said to be isochronous.


Figure 4.46 Path described by a cycloidal pendulum.


The time which the particle takes to travel from a point at the height

$$y_0 = -a(1 + \cos\theta_0) \qquad (0 < \theta_0 < \pi)$$

down to the lowest point, and up again to the height $y_0$, by formula
(50) of p. 410, is

$$\frac{T}{2} = \frac{1}{\sqrt{2g}}\int_{-\theta_0}^{\theta_0}\sqrt{\frac{x'^2 + y'^2}{y_0 - y}}\,d\theta = \sqrt{\frac{2a}{g}}\int_{-\theta_0}^{\theta_0}\frac{\cos(\theta/2)}{\sqrt{\cos\theta - \cos\theta_0}}\,d\theta.$$

Using exactly the same substitutions as for the period of the simple
pendulum, we arrive at the integral

$$\frac{T}{2} = 2\sqrt{\frac{a}{g}}\int_{-1}^{1}\frac{du}{\sqrt{1 - u^2}},$$

and we therefore obtain

$$T = 4\pi\sqrt{\frac{a}{g}}.$$

The period of oscillation T, therefore, is indeed independent of the
amplitude $\theta_0$. A simple way of actually constraining a particle by a
string to move on a cycloid will be described on p. 428.
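
The isochronism of the cycloid can also be observed numerically without using the substitution that evaluates the integral exactly. The Python sketch below is illustrative: the function name `cycloid_period`, the SI value of g, and the change of variable θ = θ0 sin φ (chosen here only to tame the endpoint singularities of the integrand) are assumptions, not part of the text.

```python
import math


def cycloid_period(theta0, a=1.0, g=9.81, n=20000):
    """Period on the cycloid x = a(theta + pi + sin theta), y = -a(1 + cos theta).

    Evaluates T/2 = sqrt(2a/g) * integral_{-theta0}^{theta0}
    cos(theta/2) / sqrt(cos theta - cos theta0) d(theta)
    by the midpoint rule in phi, where theta = theta0 * sin(phi).
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -math.pi / 2.0 + (i + 0.5) * h
        th = theta0 * math.sin(phi)
        integrand = math.cos(th / 2.0) / math.sqrt(math.cos(th) - math.cos(theta0))
        total += integrand * theta0 * math.cos(phi) * h  # d(theta) = theta0 cos(phi) d(phi)
    return 2.0 * math.sqrt(2.0 * a / g) * total


if __name__ == "__main__":
    expected = 4.0 * math.pi * math.sqrt(1.0 / 9.81)
    for theta0 in (0.2, 1.0, 2.5):
        print(theta0, cycloid_period(theta0), expected)
```

The computed period should be the same for every starting amplitude and should agree with 4π√(a/g) to the accuracy of the quadrature.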
*4.8 Motion in a Gravitational Field


As an example of unconstrained motion we consider a particle 
moving in the gravitational field of an attracting mass. 


a. Newton’s Universal Law of Gravitation 


Kepler's description of the motion of the planets, which was based on
the precise observations of Tycho Brahe, led Newton to formulate his
general law for the gravitational attraction between any two particles.
Let $P_0 = (x_0, y_0)$ and $P = (x, y)$ be two particles of masses $m_0$ and m,
respectively. Let $r = \sqrt{(x - x_0)^2 + (y - y_0)^2}$ be the distance between
the particles. Then $P_0$ exerts on P a force F which has the direction of
$\overrightarrow{PP_0}$ and the magnitude $|\mathbf{F}| = \gamma m_0 m/r^2$, where γ is the “universal
gravitational constant.” Since F can then only differ by a positive
factor from the vector $\overrightarrow{PP_0}$, which itself has magnitude r, we must have

$$\mathbf{F} = \frac{\gamma m_0 m}{r^3}\,\overrightarrow{PP_0} = \left(\frac{\gamma m_0 m (x_0 - x)}{r^3},\ \frac{\gamma m_0 m (y_0 - y)}{r^3}\right).$$

This law of attraction refers to particles, that is, to bodies that can be
considered to be concentrated in points, neglecting the actual extent of
the bodies (Fig. 4.47). The validity of such an assumption is plausible
enough for celestial bodies whose mutual distances are tremendous
when compared with their diameters. Newton vastly increased the
range of application of this law by showing that the same law of attraction
also describes the attraction of a body of mass $m_0$ of considerable extent
on a particle of mass m, provided that the body is a sphere of constant
density, or, more generally, provided that the body is made up of
concentric spherical shells of constant density; in that case the attraction
of the body on a particle P located outside the body is the same as
if the total mass $m_0$ of the body were located at its center $P_0$ (Fig. 4.47).
The earth can with fair accuracy be thought of as made up of concentric
shells of constant density, so that the attraction of the earth on a
particle of mass m on its surface is directed toward the center $P_0$ of the
earth (that is, vertically downward for an observer) and has magnitude
$\gamma m_0 m/R^2$, where R is the radius of the earth and $m_0$ its mass. We can
then identify $\gamma m_0 m/R^2$ with mg, where g is the gravitational acceleration
(see p. 398). In other words, we have $g = \gamma m_0/R^2$.

From Newton’s fundamental law we find for a particle P of mass m
moving under the influence of the attraction of a mass $m_0$ located at $P_0$
the equations of motion

$$\ddot{x} = \frac{\gamma m_0 (x_0 - x)}{r^3}, \qquad \ddot{y} = \frac{\gamma m_0 (y_0 - y)}{r^3}.$$

Figure 4.47 (a) Newtonian attraction of two particles. (b) Gravitational attraction of the earth.

We now make the further simplifying assumption that $m_0$ is so much
larger than m that the effects of the attraction of P on $P_0$ can be neglected
and $P_0$ can be considered at rest. This would, for example, be the
situation for a pair of bodies like the sun and a planet or the earth and
a body on its surface. Taking the origin of coordinates at $P_0$ we then
have for P = (x, y) the equations of motion

$$\ddot{x} = -\frac{\gamma m_0 x}{r^3}, \qquad \ddot{y} = -\frac{\gamma m_0 y}{r^3} \tag{51}$$

with $r = \sqrt{x^2 + y^2}$.


b. Circular Motion about the Center of Attraction


We shall not attempt to find the most general solution of these
differential equations (which, as is well known, would correspond to
motion along a path of the form of a conic section, with one focus at
the attracting center). Instead, we shall just consider the simplest types
of motion consistent with these equations, namely, uniform circular
motions about the origin and motions along a radius from the origin.
For uniform circular motion of P along a circle of radius a about the
origin we have r = a and

$$x = a\cos\omega t, \qquad y = a\sin\omega t,$$

where ω is a constant. The period T of the motion, that is, the time
after which P returns to the same position, is $T = 2\pi/\omega$. We find for
the velocity components

$$\dot{x} = -a\omega\sin\omega t, \qquad \dot{y} = a\omega\cos\omega t,$$

so that the speed of P in its orbit is

$$v = \sqrt{\dot{x}^2 + \dot{y}^2} = a\omega = \frac{2\pi a}{T}. \tag{52}$$

The acceleration of P has the components

$$\ddot{x} = -a\omega^2\cos\omega t = -\omega^2 x, \qquad \ddot{y} = -a\omega^2\sin\omega t = -\omega^2 y.$$

Clearly, the equations of motion (51) are then satisfied if

$$\omega^2 = \frac{\gamma m_0}{a^3}$$

or

$$a^3 = \frac{\gamma m_0}{\omega^2} = \frac{\gamma m_0}{4\pi^2}\,T^2. \tag{53}$$

This is just Kepler’s third law for the special case of circular motion,
according to which the cubes of the distances of the planets from the
sun are proportional to the squares of their periods.

We can give some simple illustrations of Kepler’s law for the case
where the attracting body is the earth with its mass $m_0$ and radius R.
Observing that here $\gamma m_0 = gR^2$ we have

$$a^3 = \frac{gR^2}{4\pi^2}\,T^2.$$
For a satellite circling the earth at tree-top level (neglecting, of course,
air resistance) we have a = R ≈ 3963 miles. We find then from our
formula for the period of the satellite the value

$$T = 2\pi\sqrt{\frac{R}{g}} \approx 1.4 \text{ hours}$$

and for its velocity in its orbit

$$v = \frac{2\pi a}{T} = \sqrt{Rg} \approx 26{,}000 \text{ feet per second}. \tag{54}$$

We can compare the value of T for the satellite circling the earth
with the period of 27.32 days of the moon, that is, the time after which
the moon returns to the same position among the stars (“sidereal
month”). By Kepler’s law the ratio of the distance a of the moon to the
radius R of the earth should be given by the ⅔-power of the ratio of
the periods. This leads for the distance of the moon from the center of
the earth to the value

$$a = \left(\frac{27.32 \times 24}{1.4}\right)^{2/3} R \approx 60R \approx 240{,}000 \text{ miles},$$

which agrees well with the actual average value of the distance.
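
These numerical illustrations are easy to reproduce. The Python sketch below uses the values quoted in the text (g = 32.16 feet per second per second, R = 3963 miles, a sidereal month of 27.32 days); the function and constant names are assumptions made for illustration.

```python
import math

G_FT = 32.16                 # gravitational acceleration, ft/s^2
R_MILES = 3963.0             # radius of the earth in miles
R_FT = R_MILES * 5280.0      # radius of the earth in feet


def circular_period(a, g=G_FT, R=R_FT):
    """Period T (seconds) of a circular orbit of radius a: a^3 = g R^2 T^2 / (4 pi^2)."""
    return 2.0 * math.pi * math.sqrt(a**3 / (g * R**2))


def circular_speed(a, g=G_FT, R=R_FT):
    """Orbital speed v = 2 pi a / T (feet per second)."""
    return 2.0 * math.pi * a / circular_period(a, g, R)


def orbit_radius(T, g=G_FT, R=R_FT):
    """Radius a (feet) of the circular orbit with period T (seconds)."""
    return (g * R**2 * T**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("satellite period (hours):", circular_period(R_FT) / 3600.0)
    print("satellite speed (ft/s):  ", circular_speed(R_FT))
    moon_ratio = orbit_radius(27.32 * 86400.0) / R_FT
    print("moon distance / R:       ", moon_ratio)
    print("moon distance (miles):   ", moon_ratio * R_MILES)
```

The printed values can be compared directly with the rounded figures of 1.4 hours, 60R, and 240,000 miles given above.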


c. Radial Motion—Escape Velocity 


The second type of motion we shall consider is that of a particle
moving from the center of attraction along a ray, say the x-axis. Here
y = 0, x = r, so that the equations of motion reduce to

$$\ddot{x} = -\frac{\gamma m_0}{x^2}.$$

Following our general procedure for equations of the type $\ddot{s} = f(s)$,
we multiply both sides of this equation by $\dot{x}$ and obtain

$$\dot{x}\ddot{x} = -\gamma m_0\,\frac{\dot{x}}{x^2}$$

or

$$\frac{d}{dt}\left(\frac{1}{2}\dot{x}^2\right) = \frac{d}{dt}\left(\frac{\gamma m_0}{x}\right).$$

Thus the expression

$$\frac{1}{2}\dot{x}^2 - \frac{\gamma m_0}{x}$$

has a constant value h during the motion. (Later on we shall recognize
this fact as an instance of the law of conservation of energy.) If we
introduce x instead of t as independent variable, we have then

$$\frac{dt}{dx} = \frac{1}{\dot{x}} = \pm\frac{1}{\sqrt{2h + (2\gamma m_0/x)}},$$

which by integration leads to

$$t = t_0 \pm \int_{x_0}^{x}\frac{d\xi}{\sqrt{2h + (2\gamma m_0/\xi)}}.$$

We shall not bother to carry out the integration, which can be performed
easily with the help of the methods developed in Chapter 3. For a particle
released at the time $t_0 = 0$ at the distance $x_0$ with initial velocity zero we
have $h = -\gamma m_0/x_0$. The time required for such a particle to fall into
the attracting particle (x = 0) is then

$$t = \int_0^{x_0}\frac{d\xi}{\sqrt{2\gamma m_0(1/\xi - 1/x_0)}} = \frac{\pi}{2}\sqrt{\frac{x_0^3}{2\gamma m_0}}.$$

By Kepler’s law this is $\sqrt{2}/8$ times the time it would take the particle
to circle the center of attraction at the distance $x_0$ [see Eq. (53),
p. 415].

The relation

$$\frac12\dot x^2 - \frac{\gamma m_0}{x} = h$$

has an interesting consequence when we investigate the circumstances
under which a particle can escape to infinity. Since $\frac12\dot x^2 \ge 0$
we find for $x \to \infty$ that the constant h must be nonnegative, and
hence that $\frac12\dot x^2 - \gamma m_0/x \ge 0$ during the whole motion.
In particular, a particle starting at the distance x = a with velocity v
can escape to infinity only if $\frac12 v^2 - \gamma m_0/a \ge 0$. The
lowest possible value of the velocity v which will permit a particle to
escape to infinity is then $v = \sqrt{2\gamma m_0/a}$. This is the escape
velocity $v_e$. For a particle starting at the surface of the earth and
escaping to infinity, that is, escaping its gravitational pull, we have
a = R, $\gamma m_0 = gR^2$, so that

$$v_e = \sqrt{2gR} \approx 37{,}000 \text{ feet per second}.$$

Hence [cf. (54), p. 416] the escape velocity is just $\sqrt{2}$ times the
velocity needed to maintain a satellite in a circular orbit near the
earth. A meteor falling from infinity onto the earth also would have
velocity $v_e$ on impact, if we neglect air resistance and motion of the
earth in its orbit.
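
The numbers in this section can be checked with a few lines of Python. The following is only a sketch: the values of g and R are assumed round figures, not taken from the text, and the function names are illustrative. The fall-time integral is evaluated after the substitution $\xi = x_0\sin^2\theta$, which removes the singularity of the integrand at the end points.

```python
import math

# Assumed illustrative constants (not taken from the text).
G_FT_S2 = 32.17             # surface gravity g in ft/s^2
R_FT = 3959.0 * 5280.0      # earth radius R in feet


def escape_velocity(g, radius):
    """v_e = sqrt(2 g R), using gamma*m0 = g R^2 at the surface."""
    return math.sqrt(2.0 * g * radius)


def circular_velocity(g, radius):
    """Speed on a circular orbit of radius R near the surface: sqrt(g R)."""
    return math.sqrt(g * radius)


def fall_time_closed_form(x0, gm0):
    """(pi/2) * sqrt(x0^3 / (2 gamma m0)), with gm0 standing for gamma*m0."""
    return 0.5 * math.pi * math.sqrt(x0 ** 3 / (2.0 * gm0))


def fall_time_numeric(x0, gm0, n=2000):
    """Midpoint rule for the fall-time integral after xi = x0 sin^2(theta).

    The substitution turns the integrand into
    2 x0^(3/2) sin^2(theta) / sqrt(2 gamma m0) on [0, pi/2].
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.sin(theta) ** 2
    return 2.0 * x0 ** 1.5 / math.sqrt(2.0 * gm0) * total * h


def orbit_period(x0, gm0):
    """Period of a circular orbit of radius x0 (Kepler's third law)."""
    return 2.0 * math.pi * math.sqrt(x0 ** 3 / gm0)


print(escape_velocity(G_FT_S2, R_FT))                       # feet per second
print(fall_time_closed_form(2.0, 3.0) / orbit_period(2.0, 3.0))
```

With these assumed constants the first printed value lies close to the 37,000 feet per second quoted above, and the second reproduces the factor $\sqrt{2}/8$.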


4.9 Work and Energy 
a. Work Done by Forces during a Motion 


The concept of work throws new light on the considerations of the 
last section and on many other questions of mechanics and physics. 

Let us again think of the particle as moving on a curve under the
influence of a force acting along the curve, and let us suppose that its
position is specified by the length of arc s measured from any fixed
initial point. The force f acting in the direction of motion itself will
then, as a rule, be a function of s. This function will have positive
values where the direction of the force is the same as the direction of
increasing values of s and negative values where the direction of the
force is opposite to that of increasing values of s.

If the magnitude of the force is constant along the path, we mean by
the work done by the force the product of the force by the distance
$(s_1 - s_0)$ traversed, where $s_1$ denotes the final point and $s_0$ the
initial point of the motion. If the force is not constant, we define the
work by means of a limiting process. We subdivide the interval from $s_0$
to $s_1$ into n equal or unequal subintervals and notice that if the
subintervals are small, the force in each one is nearly constant; if
$\sigma_\nu$ is a point chosen arbitrarily in the $\nu$th subinterval, then
throughout this subinterval the force will be approximately
$f(\sigma_\nu)$. If the force throughout the $\nu$th subinterval were
exactly $f(\sigma_\nu)$, the work done by our force would be exactly

$$\sum_{\nu=1}^{n} f(\sigma_\nu)\,\Delta s_\nu,$$

where $\Delta s_\nu$ as usual denotes the length of the $\nu$th subinterval.
If we now pass to the limit, letting n increase beyond all bounds while
the length of the longest subinterval tends to zero, then by the
definition of an integral our sum will tend to

$$W = \int_{s_0}^{s_1} f(s)\,ds,$$

which we naturally call the work done by the force.
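
The limiting process can be watched numerically. The sketch below is not part of the original text; it assumes a hypothetical force law $f(s) = 3s^2$ on the interval from 0 to 2, for which the exact work is 8, and it takes each $\sigma_\nu$ to be the midpoint of its subinterval.

```python
def work_riemann(force, s0, s1, n):
    """Sum of f(sigma_nu) * delta_s_nu over n equal subintervals,
    with sigma_nu chosen as the midpoint of each subinterval."""
    ds = (s1 - s0) / n
    return sum(force(s0 + (nu + 0.5) * ds) * ds for nu in range(n))


def sample_force(s):
    # Hypothetical force law chosen for illustration: f(s) = 3 s^2,
    # whose exact work from s = 0 to s = 2 is 2^3 = 8.
    return 3.0 * s * s


for n in (4, 16, 64, 256):
    print(n, work_riemann(sample_force, 0.0, 2.0, n))
```

The printed sums approach 8 as n grows, as the definition of the integral requires.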

If the direction of the force and that of the motion are the same, the
work done by the force is positive; we then say that the force does
work. On the other hand, if the direction of the force and that of the
motion are opposed, the work done by the force is negative; we then
say that work is done against the force.

If we regard the coordinate of position s as a function of the time t,
so that the force f(s) = p is also a function of t, then in a plane with
rectangular coordinates s and p we can plot the point with coordinates
s = s(t), p = p(t) as a function of the time. This point will describe a
curve, which may be called the work diagram of the motion. If we are
dealing with a periodic motion, as in any machine, then after a certain
time T (one period) the moving point (s(t), p(t)) must return to the same
point; that is, the work diagram will be a closed curve. In this case the
curve may consist simply of one and the same arc, traversed first
forward and then backward; this happens, for instance, in elastic
oscillations. However, it is also possible for the curve to be a more
general closed curve, enclosing an area; this is the case, for example,
with machines in which the pressure on a piston is not the same during
the forward stroke as during the backward stroke. The work done in
one cycle, that is, in time T, will then be given simply by the negative
of the area of the work diagram or, in other words, by the integral

$$\int_{t_0}^{t_0+T} p(t)\,\frac{ds}{dt}\,dt,$$

where the interval of time from $t_0$ to $t_0 + T$ represents exactly one
period of the motion. If the boundary of the area is positively traversed,
the work done is negative; if negatively traversed, the work done is
positive. If the curve consists of several loops, some traversed
positively and some traversed negatively, the work done is given by the
sum of the areas of the loops, each with its sign changed.
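
The sign rule can be illustrated by a small computation. The following sketch is an illustration only: the work diagram is assumed to be the unit circle in the (s, p)-plane, and the function name `cycle_work` is hypothetical. The derivative ds/dt is replaced by a difference quotient over each small time step.

```python
import math


def cycle_work(s, p, t0, period, n=4000):
    """Approximate the integral of p(t) * ds/dt over one period.

    Over each small time step, ds/dt * dt is replaced by the increment
    of s, so the sum is p(midpoint) * (s(t_right) - s(t_left)).
    """
    dt = period / n
    total = 0.0
    for i in range(n):
        t_left = t0 + i * dt
        t_right = t_left + dt
        total += p(t_left + 0.5 * dt) * (s(t_right) - s(t_left))
    return total


# Hypothetical work diagram: the unit circle s = cos t, p = sin t,
# traversed counterclockwise (positively).
w_positive = cycle_work(math.cos, math.sin, 0.0, 2.0 * math.pi)

# The same circle traversed clockwise (negatively): s = cos t, p = -sin t.
w_negative = cycle_work(math.cos, lambda t: -math.sin(t), 0.0, 2.0 * math.pi)

print(w_positive, w_negative)
```

The enclosed area is $\pi$ in both cases; the positively traversed loop gives work close to $-\pi$ and the negatively traversed loop gives work close to $+\pi$.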

These considerations are illustrated in practice by the indicator 
diagram of an old-fashioned steam engine. By a suitably designed 
mechanical device a pencil is made to move over a sheet of paper; the 
horizontal motion of the pencil relative to the paper is proportional to 
the distance s of the piston from its extreme position, whereas the 
vertical motion is proportional to the steam pressure, and hence 
proportional to the total force p of the steam on the piston. The 
pencil therefore describes the work diagram for the engine on a known
scale. The area of this diagram is measured (usually by means of a 
planimeter), and the work done by the steam on the piston is thus found. 


1 Note that here we must carefully characterize the force of which we are speaking. 
For example, in lifting a weight the work done by the force of gravity is negative: 
Work is done against gravity. But from the point of view of the person doing the 
lifting the work done is positive, for the person must exert a force opposed to gravity. 
Here we also see that our convention for the sign of an area, as discussed
on p. 365, is definitely of practical interest. For it sometimes
happens when an engine is running light, that the highly expanded
steam at the end of the stroke has a pressure lower than that required
to expel it on the return stroke; on the diagram this is shown by a
positively traversed loop; the engine itself is drawing energy from the
flywheel instead of furnishing energy.


b. Work and Kinetic Energy. Conservation of Energy 


The law of motion

$$m\ddot s = f$$

leads to a fundamental relation between the changes in velocity during
the motion of a particle along a curve and the work done by the force
f in the direction of motion. We apply the same device used already
several times in the preceding examples and multiply both sides of the
equation of motion by $\dot s$:

$$m\dot s\,\ddot s = f(s)\,\dot s.$$

Now $m\dot s\,\ddot s = (d/dt)\tfrac12 m\dot s^2 = (d/dt)\tfrac12 mv^2$,
where $v(t) = \dot s$ is the velocity of the particle. Integrating both
sides of the equation with respect to t between the limits $t_0$ and
$t_1$, we find

$$\frac12 mv^2(t_1) - \frac12 mv^2(t_0) = \int_{t_0}^{t_1} f(s)\,\frac{ds}{dt}\,dt = \int_{s_0}^{s_1} f(s)\,ds = W.$$

The quantity $\frac12 mv^2$ is called the kinetic energy K of the particle.
Hence: The change in kinetic energy of a particle during the motion
equals the work done by the force acting on the particle in the direction
of motion.
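
As a numerical illustration of this statement, the sketch below integrates $m\ddot s = f(s)$ for an assumed force $f(s) = -ks$ and compares the change in kinetic energy with the work $\int f\,ds$. The values of the mass, the constant k, and the initial data are arbitrary choices for the example, and the Runge-Kutta integrator is simply one convenient way to follow the motion.

```python
def rk4_motion(force, mass, s0, v0, t_end, steps):
    """Integrate m * s'' = f(s) with the classical Runge-Kutta method."""
    dt = t_end / steps
    s, v = s0, v0
    for _ in range(steps):
        k1s, k1v = v, force(s) / mass
        k2s, k2v = v + 0.5 * dt * k1v, force(s + 0.5 * dt * k1s) / mass
        k3s, k3v = v + 0.5 * dt * k2v, force(s + 0.5 * dt * k2s) / mass
        k4s, k4v = v + dt * k3v, force(s + dt * k3s) / mass
        s += dt * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return s, v


MASS, K = 2.0, 3.0          # assumed values for illustration


def spring_force(s):
    return -K * s           # hypothetical force law f(s) = -k s


s_start, v_start = 1.0, 0.0
s_end, v_end = rk4_motion(spring_force, MASS, s_start, v_start, 0.7, 700)

# Change in kinetic energy between the initial and final times.
delta_kinetic = 0.5 * MASS * v_end ** 2 - 0.5 * MASS * v_start ** 2
# Work W = integral of -k s ds from s_start to s_end, evaluated exactly.
work = -0.5 * K * (s_end ** 2 - s_start ** 2)

print(delta_kinetic, work)
```

The two printed numbers agree to within the accuracy of the integration.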

The quantity f represents the force acting in the direction of motion,
or the tangential component of force. For a force $F = (\rho, \sigma)$
the force in the direction of motion is

$$f = \rho\,\frac{dx}{ds} + \sigma\,\frac{dy}{ds}.$$

If $\rho$ and $\sigma$ are known functions of x and y and if the particle
is known to move along a curve x = x(s), y = y(s), then f also becomes a
known function of s. Hence in order to compute the work

(55) $$W = \int f\,ds$$

as the particle moves from one position $(x_0, y_0)$ to another
$(x_1, y_1)$, we have to know in general the path along which the
particle moves.

In an important class of cases the work W depends only on initial
and final position and can be expressed in the form

(56) $$W = V(x_0, y_0) - V(x_1, y_1)$$

with a suitable function V(x, y), the potential energy. The formula
expressing that the change in kinetic energy equals the work done by
the force then can also be written in the form

(57) $$\tfrac12 mv^2(t_1) + V(x_1, y_1) = \tfrac12 mv^2(t_0) + V(x_0, y_0).$$

Thus the quantity K + V, the sum of kinetic and potential mechanical
energy, that is, the total energy, does not change during the motion.
This is an instance of the general physical law of conservation of
energy.

A potential energy function V can easily be constructed in some of the
motions discussed earlier. Thus for a particle subject to gravity we have
F = (0, -mg) and f = -mg(dy/ds). The work done by the force of
gravity as the particle moves from a position $(x_0, y_0)$ to a position
$(x_1, y_1)$ is then

$$W = \int_{s_0}^{s_1} -mg\,\frac{dy}{ds}\,ds = \int_{y_0}^{y_1} -mg\,dy = mgy_0 - mgy_1.$$

We see that W is proportional to the change in altitude between initial
and end position. For the potential energy function V we can choose
V = mgy (or more generally V = mgy + c, where c is any constant).
The law of conservation of energy then states that the quantity

$$\tfrac12 v^2 + gy$$

is constant during the motion. We had noticed this fact already in
investigating the motion of a particle sliding down a curve (p. 408).


c. The Mutual Attraction of Two Masses 


Another example of a force with which we can associate a potential
energy function V is furnished by the gravitational attraction F exerted
by a particle $P_0 = (x_0, y_0)$ of mass $m_0$ on a particle P = (x, y) of
mass m. Here

$$F = \left(-\frac{\mu}{r^3}(x - x_0),\; -\frac{\mu}{r^3}(y - y_0)\right),$$

where $\mu = \gamma m m_0$ and $r = \sqrt{(x - x_0)^2 + (y - y_0)^2}$.
(According to Coulomb's law the same type of formula gives the
interaction of two electric charges.)

The force in the direction of motion is then

$$f = -\frac{\mu}{r^3}(x - x_0)\frac{dx}{ds} - \frac{\mu}{r^3}(y - y_0)\frac{dy}{ds} = -\frac{\mu}{r^2}\frac{dr}{ds} = \frac{d}{ds}\,\frac{\mu}{r},$$

since

$$(x - x_0)\frac{dx}{ds} + (y - y_0)\frac{dy}{ds} = \frac12\frac{d}{ds}\left[(x - x_0)^2 + (y - y_0)^2\right] = \frac12\frac{dr^2}{ds} = r\,\frac{dr}{ds}.$$

The work done by the force of attraction when the particle P moves
from a position $(x_1, y_1)$ to the position $(x_2, y_2)$ is then

$$W = \int_{s_1}^{s_2}\left(\frac{d}{ds}\,\frac{\mu}{r}\right)ds = \frac{\mu}{r_2} - \frac{\mu}{r_1} = V(x_1, y_1) - V(x_2, y_2),$$

where $V(x, y) = -\mu/r = -\mu/\sqrt{(x - x_0)^2 + (y - y_0)^2}$ is the
potential energy.
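
That the work depends only on the end points can be tested numerically. In the sketch below the attracting particle is placed at the origin, $\mu$ is given an arbitrary value, and the work is computed along two different assumed paths from (1, 0) to (3, 0): a straight segment and one turn of a spiral. All names and numbers are illustrative choices, not part of the text.

```python
import math

MU = 2.5                  # hypothetical value of mu = gamma * m * m0
X0, Y0 = 0.0, 0.0         # attracting particle P0 placed at the origin


def attraction(x, y):
    """Components of the attracting force F at the point (x, y)."""
    r = math.hypot(x - X0, y - Y0)
    return (-MU * (x - X0) / r ** 3, -MU * (y - Y0) / r ** 3)


def work_along(path, n=20000):
    """Sum of F(chord midpoint) . (chord) over n pieces of path(u), 0 <= u <= 1."""
    total = 0.0
    xa, ya = path(0.0)
    for i in range(1, n + 1):
        xb, yb = path(i / n)
        fx, fy = attraction(0.5 * (xa + xb), 0.5 * (ya + yb))
        total += fx * (xb - xa) + fy * (yb - ya)
        xa, ya = xb, yb
    return total


def straight(u):
    return (1.0 + 2.0 * u, 0.0)                        # from (1, 0) to (3, 0)


def spiral(u):
    r = 1.0 + 2.0 * u
    angle = 2.0 * math.pi * u                          # one full turn
    return (r * math.cos(angle), r * math.sin(angle))  # from (1, 0) to (3, 0)


expected = MU / 3.0 - MU / 1.0     # mu/r_2 - mu/r_1 with r_1 = 1, r_2 = 3
print(work_along(straight), work_along(spiral), expected)
```

Both paths give the same work, which is negative because the particle is moved away from the attracting center.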

If we move the particle from the position $(x_1, y_1)$ to infinity
(corresponding to $r_2 = \infty$), the work done by the force of
attraction is $-\mu/r_1$. The work done by an opposing force that moves
the particle to infinity has the same numerical value but the opposite
sign. Hence $\mu/r_1 = -V(x_1, y_1)$ is the work that has to be done
against the force of attraction in order to move the particle to infinity
from the position $(x_1, y_1)$. This important expression is called the
mutual potential of the two particles. Therefore here the potential is
defined as the work required to separate the two attracting masses
completely, for example, the work required in order to tear an electron
completely away from its atom (ionization potential).

If the attracting mass $P_0$ is considered as fixed, the law of
conservation of energy implies that the attracted particle P moves in
such a way that the expression

$$\frac12 v^2 - \frac{\gamma m_0}{r} = h$$

(the total energy per unit mass m) has a constant value during the
motion. We had derived this fact already for the special case of purely
radial motion; we see now that it holds for any type of motion under
the influence of gravitational attraction. We can conclude again that
$h \ge 0$ for a particle escaping to infinity; its orbit is then unbounded
(parabola or hyperbola) instead of bounded (ellipse). The escape
velocity

$$v_e = \sqrt{\frac{2\gamma m_0}{r}},$$

which corresponds to h = 0, is the least velocity which enables the
particle to escape to infinity from a given distance r. It does not
depend on the direction in which the particle is released but only on
the distance r from the attracting center.


d. The Stretching of a Spring 


As a third example we consider the work done in stretching a spring.
Under the assumptions on the elastic properties of the spring made on
p. 404, the force acting is f = -kx, where k is constant. The work that
must be done against this force in order to stretch the spring from the
unstretched position x = 0 to the final position $x = x_1$ is therefore
given by the integral

$$\int_0^{x_1} kx\,dx = \frac12 kx_1^2.$$

*e. The Charging of a Condenser 


The concept of work in other branches of physics can be treated in a
similar way. For example, let us consider the charging of a condenser.
If we denote the quantity of electricity in the condenser by Q, its
capacity by C, and the difference of potential (voltage) across the
condenser by V, then we know from physics that Q = CV. Moreover,
the work done in moving a charge Q through a difference of potential
V is equal to QV. Since in the charging of the condenser the difference
of potential V is not constant but increases with Q, we perform a
passage to the limit exactly analogous to that on p. 418, and as the
expression for the work done in charging the condenser we obtain

$$\int_0^{Q_1} V\,dQ = \frac1C\int_0^{Q_1} Q\,dQ = \frac12\,\frac{Q_1^2}{C} = \frac12 Q_1V_1,$$

where $Q_1$ is the total quantity of electricity passed into the condenser
and $V_1$ is the difference of potential across the condenser at the end
of the charging process.
Appendix


*A.1 Properties of the Evolute 


On p. 359 we defined the evolute E of a curve C as the locus of the
centers of curvature of C. If C is represented by x = x(s), y = y(s),
using the arc length s as parameter, then the center of curvature
$(\xi, \eta)$ of the point of C with parameter s is given by
[cf. (17a), p. 359]

(58) $$\xi = x - \rho\dot y, \qquad \eta = y + \rho\dot x,$$

with

$$\frac1\rho = \kappa = \dot x\ddot y - \dot y\ddot x.$$

The quantities $\kappa$ and $|\rho|$ are, respectively, curvature and
radius of curvature of C.

We can deduce some interesting geometrical properties of the
evolute from these formulas.

Differentiating the relation $\dot x^2 + \dot y^2 = 1$ leads to
$\dot x\ddot x + \dot y\ddot y = 0$. Since also
$\dot x\ddot y - \dot y\ddot x = 1/\rho$, we have

(59) $$\ddot x = -\frac{\dot y}{\rho}, \qquad \ddot y = \frac{\dot x}{\rho}.$$

Differentiating the formulas (58) with respect to s yields

$$\dot\xi = \dot x - \rho\ddot y - \dot\rho\dot y = -\dot\rho\dot y, \qquad \dot\eta = \dot y + \rho\ddot x + \dot\rho\dot x = \dot\rho\dot x,$$

and therefore

$$\dot\xi\dot x + \dot\eta\dot y = 0.$$

Since the direction cosines of the normal to the curve are given by
$-\dot y$ and $\dot x$, the normal to the curve C is tangent to the
evolute E at the center of curvature; or the tangent to the evolute is
the normal of the given curve; or the evolute is the "envelope" of the
normals (cf. Fig. A.1).


If further we denote the length of arc of the evolute, measured from
an arbitrary fixed point, by $\sigma$, we have, using s as parameter,

$$\dot\sigma^2 = \left(\frac{d\sigma}{ds}\right)^2 = \dot\xi^2 + \dot\eta^2.$$

Since $\dot x^2 + \dot y^2 = 1$, we obtain from our formulas (59),

$$\dot\sigma^2 = \dot\rho^2.$$
Figure A.2 String construction of the involute C of a curve E: $\rho_1 = \rho_0 + \sigma_1 - \sigma_0$.


arc of the evolute E and stretched so that a part of it extends tangentially 
away from the curve to it; if in addition the end point Q of this thread 
lies initially on the original curve C, then as we unwind the thread Q 
will describe the curve C. This accounts for the name evolute (evolvere, 
to unwind). The curve C is called an involute of the evolute E. On the 
other hand, we may start with an arbitrary curve E and construct its 
involute C by this unwinding process. Then conversely E is seen to be 
the evolute of C (Fig. A.2). 

For the proof we consider the curve E, which is now the given curve,
as given in the form $\xi = \xi(\sigma)$, $\eta = \eta(\sigma)$, where the
current rectangular coordinates are denoted by $\xi$ and $\eta$ and the
parameter $\sigma$ is length of arc on E. The winding is done as indicated
in Fig. A.3; when the thread is completely wound on to the evolute E, its
end Q coincides with the point A of E corresponding to some arc-length a.
If the thread is now unwound until it is tangent to the evolute at the
point P, corresponding to the length of arc $\sigma \ge a$, the length of
the segment PQ will be $(\sigma - a)$ and its direction cosines will be
$-\dot\xi$ and $-\dot\eta$, where the dot now denotes differentiation with
respect to $\sigma$. Thus for the coordinates x, y of the point Q we
obtain the expressions

(60) $$x = \xi - (\sigma - a)\dot\xi, \qquad y = \eta - (\sigma - a)\dot\eta,$$

which give the equations for the involute described by the point Q in
terms of the parameter $\sigma$. By differentiation with respect to
$\sigma$ we obtain

(61) $$\dot x = \dot\xi - \dot\xi + (a - \sigma)\ddot\xi = (a - \sigma)\ddot\xi, \qquad \dot y = \dot\eta - \dot\eta + (a - \sigma)\ddot\eta = (a - \sigma)\ddot\eta.$$

Figure A.3

Since $\dot\xi\ddot\xi + \dot\eta\ddot\eta = 0$, we at once find that

$$\dot\xi\dot x + \dot\eta\dot y = 0,$$


which shows that the line PQ is normal to the involute C. We can
therefore state that the normals to the curve C are tangent to the curve E.
Since the tangent to E has direction cosines $\dot\xi$, $\dot\eta$ we find
for the direction cosines of the tangent of C the expressions

(62) $$\frac{\dot x}{\sqrt{\dot x^2 + \dot y^2}} = \dot\eta, \qquad \frac{\dot y}{\sqrt{\dot x^2 + \dot y^2}} = -\dot\xi.$$

Differentiating the relation $\dot\xi\dot x + \dot\eta\dot y = 0$ with
respect to $\sigma$ and substituting for $\dot\xi$, $\dot\eta$,
$\ddot\xi$, $\ddot\eta$ their expressions from the previous equations
(61), (62) shows that

$$0 = \dot\xi\ddot x + \dot\eta\ddot y + \ddot\xi\dot x + \ddot\eta\dot y = \frac{\dot x^2 + \dot y^2}{a - \sigma} + \frac{\dot x\ddot y - \dot y\ddot x}{\sqrt{\dot x^2 + \dot y^2}}.$$

Hence the radius of curvature of the curve C corresponding to the point
Q = (x, y) turns out to be (see formula (15) on p. 355)

$$\rho = \frac1\kappa = \frac{(\dot x^2 + \dot y^2)^{3/2}}{\dot x\ddot y - \dot y\ddot x} = \sigma - a.$$

This is also the distance of the point Q from $P = (\xi, \eta)$. Because P
also lies on the normal to C at Q, we have in P the center of curvature
of C corresponding to the point Q. Thus every curve E is the evolute
of all its involutes.
Examples. We consider the evolute of the cycloid

$$x = \pi + t + \sin t, \qquad y = -1 - \cos t.$$

By Eq. (17), p. 359, the center of curvature $(\xi, \eta)$ for a curve
referred to an arbitrary parameter t is

$$\xi = x - \dot y\,\frac{\dot x^2 + \dot y^2}{\dot x\ddot y - \dot y\ddot x}, \qquad \eta = y + \dot x\,\frac{\dot x^2 + \dot y^2}{\dot x\ddot y - \dot y\ddot x}.$$

A short computation yields then for the evolute of the cycloid

$$\xi = \pi + t - \sin t, \qquad \eta = 1 + \cos t.$$

If we put $t = \tau - \pi$, then

$$\xi + \pi = \pi + \tau + \sin\tau, \qquad \eta - 2 = -1 - \cos\tau;$$

these equations show that the evolute is itself a cycloid which is similar
to the original curve, and can be obtained from it by translation as
indicated in Fig. A.4.
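
The "short computation" can be confirmed numerically. The sketch below evaluates the center of curvature from the general formula, with the derivatives of the cycloid written out by hand, and compares the result with the closed form for the evolute and with the translated cycloid. The function names are illustrative only.

```python
import math


def cycloid(t):
    return (math.pi + t + math.sin(t), -1.0 - math.cos(t))


def center_of_curvature(t):
    """Center of curvature of the cycloid from the general formula,
    valid away from the cusps, where the denominator vanishes."""
    x, y = cycloid(t)
    xd, yd = 1.0 + math.cos(t), math.sin(t)      # first derivatives
    xdd, ydd = -math.sin(t), math.cos(t)         # second derivatives
    factor = (xd * xd + yd * yd) / (xd * ydd - yd * xdd)
    return (x - yd * factor, y + xd * factor)


def evolute_closed_form(t):
    return (math.pi + t - math.sin(t), 1.0 + math.cos(t))


for t in (-2.5, -1.0, 0.0, 0.7, 2.9):
    xi, eta = center_of_curvature(t)
    tau = t + math.pi                            # the substitution t = tau - pi
    cx, cy = cycloid(tau)
    # The differences printed here should be zero up to rounding.
    print(xi - evolute_closed_form(t)[0], eta - evolute_closed_form(t)[1],
          (xi + math.pi) - cx, (eta - 2.0) - cy)
```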


Figure A.4 The cycloidal pendulum. 


This gives us a simple method of constructing a cycloidal pendulum
(see p. 412). If a mass P is attached by a thread of length 4 to one of the
cusps of the evolute, then under tension the thread will partly coincide
with the evolute and lie along a tangent to the evolute the rest of the way.
The mass P is then forced to lie on the involute, that is, on the original
cycloid. Under gravity P must describe an isochronous motion over
some portion of the cycloid with a period independent of the position
at which P begins the motion. (The parameter t to which the cycloid is
referred does not correspond to the time in the isochronous motion.)
This curve is called an astroid. Its graph is given in Fig. A.6. By
means of the parametric equations we may readily convince ourselves 
that the centers of curvature corresponding to the vertices of the ellipse 
are actually the cusps of the astroid. 


Figure A.6 Evolute of the ellipse. 


*A.2 Areas Bounded by Closed Curves. Indices 


In Section 4.2 the oriented area bounded by a closed curve x = x(t),
y = y(t), $\alpha \le t \le \beta$, which nowhere intersects itself (a
so-called simple closed curve), was represented by the integral

$$A = -\int_\alpha^\beta y(t)\,\dot x(t)\,dt;$$

the value obtained is positive or negative depending on whether the
sense in which the boundary is described is counterclockwise or
clockwise. This formula remains meaningful as a definition of A if we
allow self-intersections of curves. It remains to see how A is related to
areas in such cases. Suppose that the curve C, given by the equation
x = x(t), y = y(t), intersects itself in a finite number of points, thus
dividing the plane into a finite number of portions $R_1, R_2, \ldots$.
Suppose further that the derivatives are continuous and that
$\dot x^2 + \dot y^2 \ne 0$, except perhaps for a finite number of
jump-discontinuities (which may or may not correspond to corners).
Finally, it is assumed that the curve has a finite number of lines of
support x = constant, that is, vertical lines that are either tangent to
the curve or pass through a point of self-intersection of the curve.

Figure A.7 Indices $\mu_i$ of regions $R_i$ formed by oriented closed curve.

To each region $R_i$ we then assign an integer, the index $\mu_i$, defined
in the following way: We choose an arbitrary point Q in $R_i$, not lying on
any line of support, and erect the half-line extending from Q upward
in the direction of the positive y-axis. We count the number of times
the curve C for increasing t crosses the half-line from right to left,
and subtract the number of times the curve C crosses from left to right;
the difference is the index $\mu_i$. For example, the interior of the curve
illustrated in Fig. 4.17, p. 343, has the index $\mu = +1$; and in Fig. A.7
the regions $R_1, \ldots, R_5, R_6$ have the indexes $\mu_1 = -1$,
$\mu_2 = -2$, $\mu_3 = -1$, $\mu_4 = 0$, $\mu_5 = 1$ and $\mu_6 = 0$. This
number $\mu_i$ actually depends on the region $R_i$ only and not on the
particular point Q chosen in $R_i$, as we readily see in the following
manner. We choose any other point Q' in $R_i$, not on a line of support,
and join Q to Q' by a broken line lying entirely in the region $R_i$
(Fig. A.8). As we proceed along this broken line from Q to Q' the number
of right-to-left crossings minus the number of left-to-right crossings is
constant; for between lines of support the number of crossings of either
type is unchanged, whereas on crossing a line of support the numbers of
crossings of both types either stay the same or both increase by one or
both decrease by one; in every case, the difference is unaltered. Here a
line of support that meets the curve at several different points, say
A, B, ..., H, is considered as several different lines of support,
FA, FB, ..., FH, where F is a point vertically below all the points
A, B, ..., H. Our argument then applies to each of these lines. Hence the
number $\mu_i$ has the same value whether we use Q or Q' in determining it.

Figure A.8

In particular, if our curve does not intersect itself, the interior of the
curve consists of a single region R whose index is +1 or -1 depending
on whether the sense in which the boundary is described is
counterclockwise or clockwise. To see this we draw any vertical line (not
a line of support) intersecting the curve; on this line we find the highest
point of intersection P with the curve, and in R we choose a point Q
below P and so near it that no point of intersection lies between P
and Q. Then above Q there lies one crossing of the curve, which if the
curve is traversed in the counterclockwise sense must be a right-to-left
crossing, so that $\mu = +1$; otherwise $\mu = -1$. As we have just seen,
this same value of $\mu$ holds for every other point of R. For such a
curve, and, in fact, for all closed curves, one of the regions, the
"outside" of the curve, extends unboundedly in all directions; we see
immediately that this region has index 0, and ignore it in what follows.
Then the relation between the integral A and the areas of the regions
$R_i$ is given by the following theorem:

THEOREM. The value of the integral $-\int_\alpha^\beta y\dot x\,dt$ is
equal to the sum of the absolute areas of the regions $R_i$, each area
$R_i$ being counted $\mu_i$ times; in symbols

$$-\int_\alpha^\beta y\dot x\,dt = \sum_i \mu_i\,|\text{area } R_i|.$$


PROOF. The proof is simple. We assume, as we are entitled to do,
that the whole of the curve lies above the x-axis. (Adding a constant
to y does not change the value of the integral A for a closed curve.)
The lines of support cut $R_i$ into a finite number of portions; let r be
one of these portions. Then on taking the integral
$-\int y\dot x\,dt = -\int y\,dx$ for each single-valued branch of the
function y = y(x) and interpreting it as area between the curve and the
x-axis, we find that the absolute area of r is counted +1 times for each
right-to-left branch above r and -1 times for each left-to-right branch
above r; in all, $\mu_i$ times. The same is true for every other portion
of $R_i$; hence $R_i$ is counted $\mu_i$ times. Thus the integral round the
complete curve has the value $\sum_i \mu_i\,|\text{area } R_i|$, as stated
(cf. Fig. A.7). This formula agrees with what we have found for simple
closed curves, as we recognize from the discussion of the values of
$\mu$ for such curves.
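
The theorem can be tried out on two simple curves. In the sketch below, which is an illustration with assumed example curves, the unit circle traversed twice counterclockwise encloses one region of index 2, while the figure eight x = sin 2t, y = sin t consists of two congruent loops traversed in opposite senses, so that their contributions cancel.

```python
import math


def oriented_area(x, y, alpha, beta, n=20000):
    """Approximate A = -integral of y(t) x'(t) dt over [alpha, beta]
    by the sum of -y(midpoint) * (x(t_right) - x(t_left))."""
    dt = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        t_left = alpha + i * dt
        total -= y(t_left + 0.5 * dt) * (x(t_left + dt) - x(t_left))
    return total


def eight_x(t):
    return math.sin(2.0 * t)


# Unit circle traversed twice counterclockwise: one region with index 2.
double_circle = oriented_area(math.cos, math.sin, 0.0, 4.0 * math.pi)

# Figure eight: the loop for 0 < t < pi has index +1, the other index -1.
figure_eight = oriented_area(eight_x, math.sin, 0.0, 2.0 * math.pi)

# The upper loop alone, whose area works out to 4/3.
upper_loop = oriented_area(eight_x, math.sin, 0.0, math.pi)

print(double_circle, figure_eight, upper_loop)
```

The first value is close to $2\pi$, twice the area of the unit disc; the second is zero up to rounding; the third is close to 4/3.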


The definition given for the index $\mu_i$ has the disadvantage of being
stated in terms of a particular coordinate system. As a matter of fact,
however, it can be shown that the value of $\mu_i$ assigned to a region
$R_i$ is independent of the coordinate system and depends solely on the
curve. This can be readily seen by identifying $\mu_i$ with the total
number $\nu_i$ of times a point on the curve for t increasing from
$\alpha$ to $\beta$ runs about any fixed point $Q_i$ of $R_i$ in the
counterclockwise sense, that is with the number of times C winds around
$Q_i$. We shall prove the identity of $\mu_i$ and $\nu_i$.

Let C be given parametrically by x = x(t), y = y(t) where
$\alpha \le t \le \beta$. Let $Q = (\xi, \eta)$ be a point which does not
lie on a line of support of C. We take Q as origin of a system of polar
coordinates r, $\theta$ in which

$$r = \sqrt{(x - \xi)^2 + (y - \eta)^2}, \qquad \cos\theta = \frac{x - \xi}{r}, \qquad \sin\theta = \frac{y - \eta}{r}.$$

The polar angle $\theta$ is determined only within whole multiples of
$2\pi$; however, $\theta$ is determined uniquely as a function of t by its
value $\theta_0$ for $t = \alpha$ if we require $\theta = \theta(t)$ to
vary continuously with t along the curve C. At $t = \beta$ the angle
$\theta$ will then have a value $\theta(\beta) = \theta_0 + 2\nu\pi$,
where $\nu$ is an integer. The number

$$\nu = \frac{1}{2\pi}\left[\theta(\beta) - \theta(\alpha)\right] = \frac{1}{2\pi}\int_\alpha^\beta \frac{d\theta}{dt}\,dt = \frac{1}{2\pi}\int_C d\theta$$

represents the number of times that the oriented curve C winds
around Q.

The curve C crosses the vertical half-line through Q for those values
of t for which the expression $(1/2\pi)[\theta(t) - \pi/2]$ has an
integral value n. Consider for a fixed n the t-values in the parameter
interval for which $(1/2\pi)(\theta - \pi/2) = n$. Let $\sigma_n$ and
$\tau_n$ be the number of such t-values for which $d\theta/dt > 0$,
respectively $d\theta/dt < 0$. Obviously, the index at the point Q is

$$\mu = \sum_n \sigma_n - \sum_n \tau_n = \sum_n (\sigma_n - \tau_n).$$

On the other hand, $\sigma_n - \tau_n$ can only have one of the values 1,
0, -1, for the graph of $\theta(t)$ in the $\theta$, t-plane must cross the
line $\theta = \pi/2 + 2n\pi$ alternately from above or below. Actually,
we have $\sigma_n - \tau_n = \operatorname{sign}[\theta(\beta) - \theta(\alpha)]$
if $\pi/2 + 2n\pi$ lies between $\theta(\alpha)$ and $\theta(\beta)$
and $\sigma_n - \tau_n = 0$ otherwise.

Consequently, $\mu$ equals the number of values of the form
$\pi/2 + 2n\pi$ with an integer n that lie between $\theta(\alpha)$ and
$\theta(\beta)$ taken with the sign of $\theta(\beta) - \theta(\alpha)$;
that is, $\mu$ equals the number $\nu$.

Since $\theta = \arctan[(y - \eta)/(x - \xi)]$, we have

$$\frac{d\theta}{dt} = \frac{\dot y(x - \xi) - \dot x(y - \eta)}{(x - \xi)^2 + (y - \eta)^2}.$$

This yields for the index $\mu$ of the oriented closed curve C with respect
to the point $(\xi, \eta)$ the integral representation

$$\mu = \frac{1}{2\pi}\int_\alpha^\beta \frac{(x - \xi)\dot y - (y - \eta)\dot x}{(x - \xi)^2 + (y - \eta)^2}\,dt,$$

which can be simply written (see p. 367) without referring to the
parameter t explicitly:

$$\mu = \frac{1}{2\pi}\int_C \frac{(x - \xi)\,dy - (y - \eta)\,dx}{(x - \xi)^2 + (y - \eta)^2}.$$

The remarkable feature of these results is that the integer $\mu$ or $\nu$
which describes a topological relation between the point Q and the curve C
can be determined analytically, from the parameter representation of
C, by evaluating an integral.
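
As an illustration, the integral can be approximated by a finite sum. The sketch below uses assumed example curves (the unit circle traversed twice and the figure eight x = sin 2t, y = sin t) and replaces dx, dy by chord increments, with the integrand evaluated at the chord midpoint; the function name is hypothetical. The result is only approximately an integer and is rounded at the end.

```python
import math


def winding_number(x, y, xi, eta, alpha, beta, n=20000):
    """Approximate (1/2pi) * integral over C of
    ((x - xi) dy - (y - eta) dx) / ((x - xi)^2 + (y - eta)^2)."""
    dt = (beta - alpha) / n
    total = 0.0
    xa, ya = x(alpha) - xi, y(alpha) - eta
    for i in range(1, n + 1):
        t = alpha + i * dt
        xb, yb = x(t) - xi, y(t) - eta
        xm, ym = 0.5 * (xa + xb), 0.5 * (ya + yb)   # chord midpoint
        total += (xm * (yb - ya) - ym * (xb - xa)) / (xm * xm + ym * ym)
        xa, ya = xb, yb
    return total / (2.0 * math.pi)


def eight_x(t):
    return math.sin(2.0 * t)


# Unit circle traversed twice counterclockwise, point inside.
nu_double = winding_number(math.cos, math.sin, 0.3, 0.2, 0.0, 4.0 * math.pi)

# Figure eight: one point in each loop and one point outside.
nu_upper = winding_number(eight_x, math.sin, 0.0, 0.5, 0.0, 2.0 * math.pi)
nu_lower = winding_number(eight_x, math.sin, 0.0, -0.5, 0.0, 2.0 * math.pi)
nu_outside = winding_number(eight_x, math.sin, 2.0, 0.0, 0.0, 2.0 * math.pi)

print(round(nu_double), round(nu_upper), round(nu_lower), round(nu_outside))
```

The rounded values are the indices of the regions containing the chosen points, in agreement with the crossing count described above.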


PROBLEMS 


SECTION 4.1c, page 328


1. Sketch the hypocycloid for a = 4c (the astroid) and find its nonpara- 
metric equation. 


2. Prove that if c/a is rational the general hypocycloid is closed after the 
moving circle has rotated an integral number of times, whereas if c/a is 
irrational, the curve has infinitely many points where it meets the circum- 
ference of the fixed circle and will not close. 


3. Derive the parametric representation

$$x = at - b\sin t, \qquad y = a - b\cos t$$

for the ordinary trochoid, that is, for the path of a point P attached to a
disc of radius a rolling along a line, P having the distance b from the
center of the disc (see Fig. 4.7).


4. Find the parametric equations for the curve $x^3 + y^3 = 3axy$ (the
folium of Descartes), choosing as parameter t the tangent of the angle
between the x-axis and the ray from the origin to the point (x, y).


SECTION 4.1e, page 343


1. The angle $\alpha$ between two curves at a point of intersection is
defined to be the angle between their tangents at the point. Find a
formula for $\cos\alpha$ in terms of the parametric representations of the
curves.


2. Let x = f(t) and y = g(t). Derive formulas for $d^2y/dx^2$ and
$d^3y/dx^3$ in terms of derivatives with respect to the parameter t.


3. Find the formula for the angle $\alpha$ between two curves
$r = f(\theta)$ and $r = g(\theta)$ in polar coordinates.


4. Find the equations of the curves which everywhere intersect the straight
lines through the origin at the same angle $\alpha$.


5. Prove: if x = f(t) and y = g(t) are continuous on the closed interval
[a, b] and differentiable on the open interval (a, b) with
$x'^2 + y'^2 > 0$, then there is at least one point on the open arc

$$x = f(t), \qquad y = g(t), \qquad (a < t < b),$$

where the tangent is parallel to the chord joining the end points.


6. Let P be the point of a circle which traces out a cycloid as the circle 
rolls on a given line. Let Q be the point of contact of the circle with the line. 
Prove that at any instant, the normal to the cycloid at P passes through Q. 
What similar property holds for the tangent at P? 


7. Prove that the length of the segment of the tangent to the astroid,

$$x = 4c\cos^3\theta, \qquad y = 4c\sin^3\theta,$$

cut off by the coordinate axes is constant.
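*8. Show that the two families of ellipses and hyperbolas, (0 < a < b)

$$\frac{x^2}{a^2 - \lambda^2} + \frac{y^2}{b^2 - \lambda^2} = 1 \quad \text{for } 0 < \lambda < a,$$

$$\frac{x^2}{a^2 - \tau^2} + \frac{y^2}{b^2 - \tau^2} = 1 \quad \text{for } a < \tau < b,$$

are confocal (that is, have the same foci) and intersect at right angles.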
9. (a) Show for the ellipse that the angle between the two rays from the
foci to a point on the curve is bisected by the normal at the point.
(b) Show for the hyperbola that the angle is bisected by the tangent.
SECTION 4.1f, page 348 
1. Prove that the curve defined by

$$y = \begin{cases} x^2\sin\dfrac{1}{x}, & 0 < x \le 1 \\ 0, & x = 0 \end{cases}$$

has finite length, but that the continuous curve defined by

$$y = \begin{cases} x\sin\dfrac{1}{x}, & 0 < x \le 1 \\ 0, & x = 0 \end{cases}$$

is not rectifiable.

2. Prove that if the function f is defined and monotone on the closed
interval [a, b], then the arc defined by

$$y = f(x), \qquad (a \le x \le b),$$

is rectifiable.


SECTION 4.1g, page 352 
1. An elliptic integral of the second kind has the form

$$\int_0^{\varphi} \sqrt{1 - k^2\sin^2\theta}\;d\theta.$$

(a) Show that the arc length of the ellipse $x = a\cos\theta$,
$y = b\sin\theta$ can be expressed in terms of an elliptic integral of the
second kind.
(b) Do the same for the trochoid

$$x = at - b\sin t, \qquad y = a - b\cos t.$$

*(c) Show that the arc length of the hyperbola can be expressed in terms
of elliptic integrals of the first and second kinds.


SECTION 4.1h, page 354 


1. Let P be a point of the rolling circle which generates a cycloid and let 
Q be the lowest point of the circle at any given instant. Show that Q 
bisects the segment joining P to the center of the osculating circle of the 
cycloid at P. 


2. Find the center of curvature for $y = x^2$ when x = 0. Determine the
point of intersection of the normal lines to the curve when x = 0 and when
$x = \epsilon$. Calculate the distance of the intersection from the center
of curvature. Suggest an alternative definition for the center of
curvature. Prove that this definition is equivalent to the definition
given in the text.


3. Consider the question of whether the osculating circle crosses the curve 
at the point of contact. 


*4. Prove that the circle of curvature at a point P of the curve C is the
limit of the circles through three points P, $P_1$, $P_2$ as $P_1$ and
$P_2$ tend to P.


5. Let $r = f(\theta)$ be the equation of a curve in polar coordinates.
Prove that the curvature is given by the formula

$$\kappa = \frac{2r'^2 - rr'' + r^2}{(r'^2 + r^2)^{3/2}},$$

where

$$r' = \frac{dr}{d\theta}, \qquad r'' = \frac{d^2r}{d\theta^2}.$$

6. The curve for which the length of the tangent intercepted between the 
point of contact and the y-axis is always equal to I is called the tractrix. 
Find its equation. Show that the radius of curvature at each point of the 
curve is inversely proportional to the length of the normal intercepted between 
the point on the curve and the y-axis. Calculate the length of arc of the 
tractrix and find the parametric equations in terms of the length of arc. 

7. Let x = x(t), y = y(t) be a closed curve. A constant length p is measured
off along the normal to the curve. The extremity of this segment describes 
a curve which is called a parallel curve to the original curve. Find the area, 
the length of arc, and the radius of curvature of the parallel curve. 


8. Show that the only curves whose curvature is a fixed constant k are 
circles of radius 1/k. 

*9. If the curvature of a curve in the xy-plane is a monotonic function of
the length of arc, prove that the curve is not closed and that it has no
double points.


SECTION 4.1i, page 360


1. Show that the expression for the curvature of a curve $x = x(t)$, $y = y(t)$
is unaltered by rotation of axes and also by change of parameter given by
$t = \phi(\tau)$, where $\phi'(\tau) > 0$.

SECTION 4.3d, page 394 


1. Prove that if the acceleration is always perpendicular to the velocity,
then the speed is constant.


2. The velocity vector, considered as a position vector, traces out a curve 
known as the hodograph. Show whether or not a particle moving on a 
closed curve may have a straight line as its hodograph. 


3. Assuming the rolling circle moves at constant speed, find the velocity 
and acceleration of the point P which generates the cycloid. 


4. Let $A$ be a fixed point of the plane and suppose that the acceleration
vector for a moving point $P$ is always directed toward $A$ and proportional
to $1/|AP|^2$. Prove that the hodograph (cf. Problem 2) is a circle.


5. Let $A$ be a fixed point on a circle. Let $P$ be a point of the circle moving
so that the acceleration vector points to $A$. Prove that the acceleration is
proportional to $|AP|^{-5}$.


SECTION 4.5, page 402 


1. A particle moves in a straight line subject to a resistance producing
the retardation $\lambda u^3$, where $u$ is the velocity and $\lambda$ a constant. Find expressions
for the velocity ($u$) and the time ($t$) in terms of $s$, the distance from the initial
position, and $v_0$, the initial velocity.


2. A particle of unit mass moves along the $x$-axis and is acted upon by a
force $f(x) = -\sin x$.

(a) Determine the motion of the point if at time $t = 0$ it is at the point
$x = 0$ and has velocity $v_0 = 2$. Show that as $t \to \infty$ the particle approaches
a limiting position, and find this limiting position.

(b) If the conditions are the same, except that $v_0$ may have any value,
show that if $v_0 > 2$ the point moves to an infinite distance as $t \to \infty$, and
that if $v_0 < 2$ the point oscillates about the origin.


3. Choose axes with their origin at the center of the earth, whose radius
we shall denote by $R$. According to Newton's law of gravitation, a particle
of unit mass lying on the $y$-axis is attracted by the earth with a force $-\mu M/y^2$,
where $\mu$ is the “gravitational constant” and $M$ is the mass of the earth.

(a) Calculate the motion of the particle after it is released at the point
$y_0$ ($> R$); that is, if at time $t = 0$ it is at the point $y = y_0$ and has the velocity
$v_0 = 0$.

(b) Find the velocity with which the particle in (a) strikes the earth.

(c) Using the result of (b), calculate the velocity of a particle falling to the
earth from infinity.¹


*4. A particle perturbed slightly from rest on top of a circle slides downward
under the force of gravity. At what point does it fly unconstrained off
the circle?


¹ This is the same as the least velocity with which a projectile would have to be fired
in order that it should leave the earth and never return.


*5. A particle of mass $m$ moves along the ellipse $r = k/(1 - e \cos\theta)$.
The force on the particle is $cm/r^2$ directed toward the origin. Describe the
motion of the particle, find its period, and show that the radius vector to the 
particle sweeps out equal areas in equal times. 


SECTION 4A.1, page 424 


1. Show that the evolute of an epicycloid (Example, p. 329) is another 
epicycloid similar to the first, which can be obtained from the first by rotation 
and contraction. 


2. Show that the evolute of a hypocycloid (Example, p. 331) is another 
hypocycloid, which can be obtained from the first by rotation and expansion. 


5 


Taylor's Expansion 


5.1 Introduction: Power Series 


It was a great triumph in the early years of Calculus when Newton 
and others discovered that many known functions could be expressed 
as “polynomials of infinite order” or “power series,” with coefficients 
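formed by elegant transparent laws. The geometrical series for $1/(1 - x)$
or $1/(1 + x^2)$

$$\frac{1}{1-x} = 1 + x + x^2 + \cdots + x^n + \cdots,$$

$$\frac{1}{1+x^2} = 1 - x^2 + x^4 - + \cdots + (-1)^n x^{2n} + \cdots, \tag{1a}$$

valid for the open interval $|x| < 1$, are prototypes (see Chapter 1, p. 67).
Similar expansions of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots = \sum_{\nu=0}^{\infty} a_\nu x^\nu$$

with numerical coefficients $a_\nu$, will be derived in this chapter for many
other functions.
The following are striking examples:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots,$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \cdots,$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots.$$

These series expansions are valid for all $x$.

Newton's General Binomial Theorem. The expansion

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{1 \cdot 2} x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{1 \cdot 2 \cdot 3} x^3 + \cdots = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu$$

is valid for $|x| < 1$ and any exponent $\alpha$.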


To explain the precise meaning of such expansions, we consider the
polynomial of order $n$ formed as the sum of the first $n + 1$ terms of
the series, the $n$th “partial sum,”

$$S_n = \sum_{\nu=0}^{n} a_\nu x^\nu.$$

The formula

$$f(x) = \sum_{\nu=0}^{\infty} a_\nu x^\nu, \qquad \text{for } |x| < a,$$

then means: For $n \to \infty$ the sequence $S_n$ tends to the value of the
function $f(x)$ at each point $x$ in the interval $|x| < a$. The infinite series
is then said to converge to $f(x)$ in the interval $|x| < a$. The difference

$$R_n(x) = f(x) - S_n(x),$$

the “remainder” of the series, measures the precision with which $f(x)$
is approximated by the polynomial $S_n(x)$ at $x$. For example,

$$\frac{1}{1-x} = 1 + x + x^2 + \cdots + x^n + R_n(x), \tag{1b}$$

where the remainder $R_n(x) = x^{n+1}/(1 - x)$ tends to zero for $|x| < 1$
as $n$ increases; thus the infinite geometric series $\sum_{\nu=0}^{\infty} x^\nu = 1/(1 - x)$
results. To find simple manageable estimates for $R_n$ in specific cases
is a task of both theoretical and practical importance.
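
The behavior of the remainder in (1b) can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative choices. It compares the actual error of the partial sum with the closed form $x^{n+1}/(1 - x)$, and shows that for $|x| > 1$ the remainder grows instead of shrinking.

```python
def geometric_partial_sum(x, n):
    """Return S_n(x) = 1 + x + ... + x**n."""
    return sum(x**k for k in range(n + 1))


def geometric_remainder(x, n):
    """Closed form R_n(x) = x**(n+1) / (1 - x), valid for x != 1."""
    return x**(n + 1) / (1 - x)


if __name__ == "__main__":
    for x in (0.5, -0.9, 1.5):
        for n in (5, 20, 80):
            actual_error = 1 / (1 - x) - geometric_partial_sum(x, n)
            # The two printed numbers agree up to rounding for every x != 1;
            # they tend to zero only when |x| < 1.
            print(x, n, actual_error, geometric_remainder(x, n))
```

For $x = 1.5$ the function $1/(1 - x)$ is perfectly well defined, yet the partial sums move away from it, which is the phenomenon stressed at the end of this section.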

In this chapter we are concerned with such expansions for a wide 
class of functions, including all the “elementary” transcendental 
functions. It is a striking fact that in these expansions of transcendental 
functions the coefficients are elegant expressions in terms of integers. 
The approach to these expansions will be by Taylor's theorem; later,
in Chapter 7, we shall discuss a different approach by a direct study of
power series.

It should be emphasized that often, just as for the geometrical series
of Eq. (1a), the infinite expansion is not valid outside some interval
for $x$ (in the case of the geometrical series, the interval $x^2 < 1$) even
though the function represented by the series is well defined outside
this interval.


5.2 Expansion of the Logarithm and the Inverse Tangent 


a. The Logarithm 


As simple examples we first derive expansions of the logarithmic
and the inverse tangent functions by integration, from the geometric
series

$$\frac{1}{1-t} = 1 + t + t^2 + \cdots + t^{n-1} + r_n(t),$$

with $r_n(t) = t^n/(1 - t)$.
We substitute this sum for the integrand in the formula

$$-\log(1 - x) = \int_0^x \frac{dt}{1-t}$$

and integrate term by term, obtaining for $x < 1$

$$-\log(1 - x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots + \frac{x^n}{n} + R_n,$$

with the remainder

$$R_n = \int_0^x r_n(t)\, dt = \int_0^x \frac{t^n}{1-t}\, dt.$$

Hence for any positive integer $n$ the function $-\log(1 - x)$ is approximated
by the polynomial of $n$th degree,

$$x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n},$$

and the remainder $R_n$ indicates the “error” of this approximation.
To appraise the accuracy of this approximation we estimate the
remainder $R_n$. If we at first suppose that $-1 \le x \le 0$, then in the
entire interval of integration the integrand $t^n/(1 - t)$ nowhere exceeds
$|t^n| = (-1)^n t^n$ in absolute value. Thus

$$|R_n| \le \left| \int_0^x t^n \, dt \right| = \frac{|x|^{n+1}}{n+1};$$

hence for every value of $x$ in the closed interval $-1 \le x \le 0$, including
$x = -1$, this remainder can be made as small as we wish by choosing $n$
large enough (cf. p. 61). For $x > 0$ the end point $x = 1$ must be
omitted; we have to restrict $x$ to the half-open interval $0 \le x < 1$;
the integrand does not change sign and its absolute value does not
exceed $t^n/(1 - x)$; we thus obtain for $0 \le x < 1$ the estimate

$$|R_n| \le \frac{1}{1-x} \int_0^x t^n \, dt = \frac{x^{n+1}}{(1-x)(n+1)}.$$

Hence again, if $x$ is fixed, the remainder is arbitrarily small when $n$ is
sufficiently large. Of course, the estimate has no meaning for $x = 1$.
Summing up,

$$\log(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots - \frac{x^n}{n} - R_n, \tag{2}$$

where the remainder $R_n$ tends to zero as $n$ increases, provided that $x$
lies in the half-open interval $-1 \le x < 1$.

In fact, this reasoning establishes a “uniform” estimate for the
remainder, independent of $x$ and valid for all values of $x$ in the interval
$-1 \le x \le 1 - h$, where $h$ is any number such that $0 < h < 1$;
namely, $|R_n| \le 1/[(n + 1)h]$.

The fact that the remainder $R_n$ tends to zero in the half-open interval
$-1 \le x < 1$ is expressed by saying that in this interval the logarithmic
function is given by the infinite series¹

$$\log(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots. \tag{3}$$

If we insert the particular value $x = -1$ in this series, we obtain the
remarkable formula

$$\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots. \tag{4}$$


This is one of the relations whose discovery made a deep impression 
on the early pioneers of the calculus. 

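For the open interval $-1 < x < 1$, we have only to write $-x$ in
place of $x$ in (2) in order to obtain

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots + (-1)^{n-1} \frac{x^n}{n} - R_n', \tag{2a}$$

where

$$R_n'(x) = \int_0^{-x} \frac{t^n}{1-t}\, dt = (-1)^{n+1} \int_0^x \frac{t^n}{1+t}\, dt.$$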

¹ We leave it as an exercise to the reader to ascertain that for all values of $x$ for which
$|x| > 1$ the remainder not only fails to approach zero, but, in fact, that $|R_n|$ increases
beyond all bounds as $n$ increases, so that for such values of $x$ the polynomial is not
a good approximation of the logarithm and becomes worse with increasing $n$.

Taking $n$ as even and subtracting (2) from (2a), we have

$$\frac{1}{2} \log \frac{1+x}{1-x} = \operatorname{ar\,tanh} x = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots + \frac{x^{n-1}}{n-1} + \bar{R}_n,$$

where the remainder $\bar{R}_n$ is given by

$$\bar{R}_n = \frac{1}{2}(R_n - R_n') = \int_0^x \frac{t^n}{1-t^2}\, dt,$$

and where ar tanh $x$ is defined according to p. 233.
Observing that $1/(1 - t^2) \le 1/(1 - x^2)$, we find by an elementary
estimate of the integral that

$$|\bar{R}_n| \le \frac{|x|^{n+1}}{(n+1)(1 - x^2)};$$

thus the remainder $\bar{R}_n$ tends to zero as $n$ increases, a fact again expressed
by writing the expansion as an infinite series:

$$\frac{1}{2} \log \frac{1+x}{1-x} = \operatorname{ar\,tanh} x = x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \cdots \tag{5}$$

for all values of $x$ with $|x| < 1$. Incidentally, this result also could be
derived directly by integrating the geometric series for $1/(1 - x^2)$. It
is an advantage of this formula that as $x$ traverses the interval from
$-1$ to $1$, the expression $(1 + x)/(1 - x)$ ranges over all positive numbers.
Thus, if the value of $x$ is suitably chosen, the series enables us to calculate
the value of the logarithm of any positive number, with an error not
exceeding the above estimate for $\bar{R}_n$.
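
The closing remark, that series (5) yields the logarithm of any positive number with a controlled error, can be illustrated by a short Python sketch. It is not part of the original text: the function name is an illustrative choice, and the error bound used is twice the estimate $|x|^{n+1}/[(n+1)(1 - x^2)]$ for the remainder, since $\log y$ is twice the series.

```python
import math


def log_via_artanh(y, n):
    """Approximate log(y) for y > 0 from series (5) with x = (y - 1)/(y + 1).

    n must be even; the terms used are x, x**3/3, ..., x**(n-1)/(n-1).
    Returns (approximation, error_bound).
    """
    if y <= 0 or n < 2 or n % 2:
        raise ValueError("need y > 0 and an even n >= 2")
    x = (y - 1) / (y + 1)  # then (1 + x)/(1 - x) == y and |x| < 1
    half_log = sum(x**k / k for k in range(1, n, 2))
    # Remainder estimate for the series, doubled because log y = 2 * artanh x.
    bound = 2 * abs(x) ** (n + 1) / ((n + 1) * (1 - x * x))
    return 2 * half_log, bound


if __name__ == "__main__":
    for y in (2.0, 10.0, 0.25):
        approx, bound = log_via_artanh(y, 20)
        print(y, approx, math.log(y), bound)
```

For $y = 2$ the substitution gives $x = 1/3$, so the series converges far faster than formula (4) for $\log 2$, which corresponds to the end point $x = -1$ of series (3).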


b. The Inverse Tangent 


We can treat the inverse tangent in a way similar to that of the
logarithm, starting with the formula

$$\frac{1}{1+t^2} = 1 - t^2 + t^4 - + \cdots + (-1)^{n-1} t^{2n-2} + r_n,$$

where now $r_n = (-1)^n \dfrac{t^{2n}}{1+t^2}$.

By integration [see Eq. (14), p. 263], we obtain

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots + (-1)^{n-1} \frac{x^{2n-1}}{2n-1} + R_n,$$

$$R_n = (-1)^n \int_0^x \frac{t^{2n}}{1+t^2}\, dt;$$

we see at once that in the closed interval $-1 \le x \le 1$ the remainder $R_n$
tends to zero as $n$ increases, since

$$|R_n| \le \int_0^{|x|} t^{2n}\, dt = \frac{|x|^{2n+1}}{2n+1}.$$

From the formula for the remainder we can also easily show that for
$|x| > 1$ the absolute value of the remainder increases beyond all bounds
as $n$ increases.

We have accordingly deduced the infinite series

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots + (-1)^{n-1} \frac{x^{2n-1}}{2n-1} + \cdots, \tag{6}$$

valid for the closed interval $|x| \le 1$. Since arc tan $1 = \pi/4$, we obtain
for $x = 1$ the Leibnitz-Gregory series

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots, \tag{7}$$

an expression as remarkable as that found earlier for log 2.
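
The remainder estimate $|R_n| \le |x|^{2n+1}/(2n+1)$ can be observed numerically. The Python sketch below is an illustration added here, with function names chosen for the purpose. It compares the actual error of the partial sums of (6) with this bound, including the end point $x = 1$ of the Leibnitz-Gregory series.

```python
import math


def arctan_partial_sum(x, n):
    """First n terms: x - x**3/3 + ... + (-1)**(n-1) * x**(2n-1)/(2n-1)."""
    return sum((-1) ** (k - 1) * x ** (2 * k - 1) / (2 * k - 1)
               for k in range(1, n + 1))


def arctan_remainder_bound(x, n):
    """Bound |R_n| <= |x|**(2n+1) / (2n+1) from the estimate in the text."""
    return abs(x) ** (2 * n + 1) / (2 * n + 1)


if __name__ == "__main__":
    for x in (0.5, 1.0):
        for n in (5, 50, 500):
            error = abs(math.atan(x) - arctan_partial_sum(x, n))
            print(x, n, error, arctan_remainder_bound(x, n))
```

At $x = 1$ the bound is only $1/(2n+1)$, so series (7), remarkable as it is, is a slow way to compute $\pi$; for $|x| < 1$ the factor $|x|^{2n+1}$ makes the convergence much faster.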


5.3 Taylor’s Theorem 


Newton’s pupil Taylor observed that the elementary expansion of
polynomials lends itself to a wide generalization for nonpolynomial
functions, provided that these functions are sufficiently differentiable
and that their domain is suitably restricted.


a. Taylor’s Representation of Polynomials 


This is an entirely elementary algebraic formula concerning a
polynomial in $x$ of order $n$, say

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$

If we replace $x$ by $a + h = b$ and expand each term in powers of $h$,
there results immediately a representation of the form

$$f(a + h) = c_0 + c_1 h + c_2 h^2 + \cdots + c_n h^n. \tag{8}$$

Taylor’s formula is the relation

$$c_\nu = \frac{1}{\nu!} f^{(\nu)}(a), \tag{8a}$$

for the coefficients $c_\nu$ in terms of $f$ and its derivatives at $x = a$. To
prove this fact we consider the quantity $h = b - a$ as the independent
variable, and apply the chain rule, which shows that differentiation
with respect to $h$ is the same as differentiation with respect to $b = a + h$.
Thus successively differentiating the formula (8) with respect to $h$ and
each time thereafter substituting $h = 0$ yields successively the results

$$c_0 = f(a), \quad c_1 = f'(a), \quad \ldots, \quad \nu!\, c_\nu = f^{(\nu)}(a)$$

and therefore indeed the Taylor formula for polynomials:

$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a). \tag{9}$$

The $(n + 1)$st derivative vanishes for a polynomial of degree $n$, and
thus our formula (9) naturally terminates.

As stated, the formula (9) is nothing but an elementary algebraic
rearrangement of a polynomial in powers of $a + h$ into a polynomial in
powers of $h$.
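
Formula (8a) translates directly into a procedure for re-expanding a polynomial about a new point. The following Python sketch is an added illustration, not the book's own method of presentation; the coefficient-list representation and the function names are assumptions made for the example. It computes $c_\nu = f^{(\nu)}(a)/\nu!$ exactly with rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Coefficients of f' when f(x) = sum(coeffs[k] * x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def taylor_coefficients(coeffs, a):
    """Return [c_0, ..., c_n] with c_v = f^(v)(a) / v!, as in (8a)."""
    shifted = []
    current = list(coeffs)
    for v in range(len(coeffs)):
        shifted.append(Fraction(evaluate(current, a)) / factorial(v))
        current = derivative(current)
    return shifted


if __name__ == "__main__":
    f = [1, 2, 3, 1]                 # f(x) = 1 + 2x + 3x^2 + x^3
    c = taylor_coefficients(f, 2)    # f(2 + h) = 25 + 26h + 9h^2 + h^3
    print(c)
    print(evaluate(c, 5), evaluate(f, 2 + 5))  # both sides of (8) at h = 5
```

Since (9) is an exact algebraic identity for polynomials, the two printed values in the last line coincide for every choice of $h$, not only for small $h$.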


b. Taylor’s Formula for Nonpolynomial Functions 


Newton and his immediate pupils boldly applied formula (9) to 
nonpolynomial functions for which the expansion does not auto- 
matically stop at the nth term; instead they simply allowed n to 
increase to infinity, a procedure which for many of the important 
special functions will be justified later on. 

Assuming the function $f$ differentiable at least $n$ times in an interval
containing the points $a$ and $a + h$, we certainly can no longer write for
$f(a + h)$ an expression as in (9) of a finite number of powers of $h$, but
must account for the discrepancy by an additional “remainder” $R_n$,
writing tentatively

$$f(b) = f(a + h) = f(a) + h f'(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + R_n; \tag{10}$$

in fact, (10) is nothing but a definition of the corrective remainder term
$R_n$, and indicates the expectation that $R_n$ might become small and tend
to zero for $n \to \infty$. If the remainder indeed tends to zero, then the
formula (10) in the limit $n \to \infty$ leads to an expansion

$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + \cdots \tag{11}$$

of $f(x)$ as an infinite power series in $h$.
The crucial problem, far transcending in difficulty that of the algebraic
manipulations in Section 5.3a, is then to find estimates for the
remainder $R_n$ so that the accuracy of Taylor’s representation by the
finite Taylor polynomial of order $n$ in $h$

$$T_n(h) = \sum_{\nu=0}^{n} \frac{f^{(\nu)}(a)}{\nu!} h^\nu \tag{12}$$

and the passage to the limit for $n \to \infty$ can be rigorously explored.
Taylor’s polynomial $T_n(h)$ is an approximation to $f(a + h)$ in the
sense that at $h = 0$ the functions $T_n$ and $f$, as well as their derivatives
up to order $n$, coincide, so that the difference $R_n = f - T_n$ vanishes at
$x = a$ together with its first $n$ derivatives.


5.4 Expression and Estimates for the Remainder 
a. Cauchy’s and Lagrange’s Expressions 


A direct representation of the remainder $R_n$, allowing estimates of
its absolute value $|R_n|$, is the core of Taylor’s theorem. The results are
easily obtained on the basis of the mean value theorem of calculus.
They are moreover related to the linear approximation of functions by
differentials (see p. 179).

Let us first examine again this approximation.

The definition of derivative at the point $a$ states merely that
$f(a + h) = f(a) + h f'(a) + h\epsilon$, where $\epsilon \to 0$ for $h \to 0$. We can
attain a somewhat sharper approximation by ascertaining that $\epsilon$ is in
fact of order at least as small as $h$, provided that not only $f'$ but also
$f''$ exists and is continuous in our interval $J$. The estimate is obtained
if we write again $a + h = b$, introduce a remainder $R$ by

$$f(b) = f(a) + (b - a) f'(a) + R, \tag{13}$$

and now consider $b$ as fixed and the initial point $a$ as variable; this
equation defines $R$ as a function of $a$ in the interval $J$; then differentiation
with respect to the variable $a$ yields zero on the left-hand side
since $f(b)$ is constant, and the rule for differentiating a product shows
that

$$0 = f'(a) - f'(a) + (b - a) f''(a) + R'(a)$$

and hence

$$-R'(a) = (b - a) f''(a). \tag{14}$$

Now, for $a = b$ we obviously have $R(b) = 0$. By the mean value
theorem of calculus $[R(a) - R(b)]/(b - a) = -R'(\xi)$, where $\xi$ is a
not otherwise specified value between $a$ and $b$; because of $R(b) = 0$ we
therefore conclude $R(a) = -(b - a)R'(\xi) = -hR'(\xi)$. Now by (14)
$R'(\xi) = -(b - \xi) f''(\xi)$ and hence $|R'(\xi)| \le h\,|f''(\xi)|$ since $|b - \xi| \le h$.
Since $|f''(\xi)|$ is bounded in an interval around $a$, we obtain finally an
estimate that shows that the remainder or “error” $R$ is small of at
least second order in $h$:

$$|R(a)| \le h^2 |f''(\xi)|. \tag{15}$$


We turn from the special case $n = 1$ to that of any order $n$. The
direct characterization of the remainder $R_n$ is achieved by the same
device as for $n = 1$. We assume that $a$ and $b = a + h$ are points in an
interval $J$ in which $f(x)$ is defined and has continuous derivatives up
to the order $n + 1$. We consider $a$ as the independent variable and keep
the end point $b$ fixed. In formula (10), p. 446, which defines $R_n(a)$, we
write $b - a$ instead of $h$. Differentiating and taking into account that
$f(b)$ is constant, we find from the product rule that almost all terms
cancel out, and we are left with the formula

$$0 = \frac{(b - a)^n}{n!} f^{(n+1)}(a) + R_n'(a) \tag{16}$$

for every value $a$ in the interval. Since for $a = b$ the remainder $R_n$ is
zero, this direct expression for its derivative as a function of $a$ completely
characterizes $R_n$ as the integral $\int_b^a R_n'(t)\, dt = -\int_a^b R_n'(t)\, dt$ or

$$R_n(a) = \frac{1}{n!} \int_a^b (b - t)^n f^{(n+1)}(t)\, dt. \tag{17}$$

This is an exact integral representation of the remainder.
An estimate for $R_n$ similar to the one obtained above for $n = 1$
follows directly by the mean value theorem of calculus applied to (16):

$$\frac{R_n(a) - R_n(b)}{b - a} = \frac{R_n(a)}{b - a} = -R_n'(\xi) = \frac{(b - \xi)^n}{n!} f^{(n+1)}(\xi)$$

or

$$R_n(a) = \frac{(b - a)(b - \xi)^n}{n!} f^{(n+1)}(\xi), \tag{18}$$

where $\xi$ is a suitable, not specified, intermediate value between $a$ and $b$.
The same estimate can also be obtained by applying to the expression
(17) the mean value theorem of integral calculus (Chapter 2, p. 141).


Cauchy's Form of the Remainder. If we define $\xi = a + \theta h
= a + \theta(b - a)$ we obtain Cauchy's formula for the remainder in
Taylor’s formula (10)

$$R_n(a) = \frac{h^{n+1}}{n!} (1 - \theta)^n f^{(n+1)}(a + \theta h), \tag{19}$$

where $\theta$ is an unspecified quantity between 0 and 1.
We can also apply to the integral (17) for $R_n$ the generalized mean
value theorem of integral calculus (see p. 142), taking for the “weight
function” $p(t)$ the expression $p(t) = (b - t)^n$, which does not change
sign throughout the interval of integration.¹ Then

$$R_n = \frac{1}{n!} f^{(n+1)}(\xi) \int_a^b (b - t)^n\, dt = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(\xi). \tag{20}$$

Lagrange’s Form of the Remainder. Setting again $\xi = a + \theta h$
yields Lagrange’s form for the remainder

$$R_n(a) = \frac{h^{n+1}}{(n + 1)!} f^{(n+1)}(a + \theta h) \tag{21}$$

with a suitable quantity $\theta$ satisfying $0 < \theta < 1$. Lagrange’s form is
particularly suggestive, and hence more commonly applied, since it
makes the remainder $R_n$ in the formula

$$f(a + h) = f(a) + \frac{h}{1!} f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + R_n = P_n(h) + R_n \tag{22}$$

look like the term $h^{n+1} f^{(n+1)}(a)/(n + 1)!$ that would arise in the
expansion (22) to one order higher, only with the argument $a$ replaced
by the intermediate value $a + \theta h$.

For a function $f$ for which $f^{(n+1)}$ is continuous in a closed interval
containing the point $a$, the quantity $|f^{(n+1)}(\xi)|$ has a fixed bound $M$.
Since then

$$|R_n| \le M \frac{|h|^{n+1}}{(n + 1)!},$$

the Taylor polynomial $P_n(h)$ gives for fixed $n$ an approximation to the
function $f(a + h)$ with an error of order at least $n + 1$ in $h$.
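
The bound $|R_n| \le M |h|^{n+1}/(n+1)!$ can be tested on a concrete function. The following Python sketch is an added illustration under stated assumptions: it takes $f = \cos$, for which every derivative is bounded by $M = 1$, and uses the identity $\cos^{(k)}(x) = \cos(x + k\pi/2)$; the function names are hypothetical.

```python
import math


def cos_derivative(k, x):
    """k-th derivative of cos at x, via cos^(k)(x) = cos(x + k*pi/2)."""
    return math.cos(x + k * math.pi / 2)


def taylor_polynomial(a, h, n):
    """P_n(h) = sum over v of h**v / v! * f^(v)(a), for f = cos."""
    return sum(h**v / math.factorial(v) * cos_derivative(v, a)
               for v in range(n + 1))


def lagrange_bound(h, n, M=1.0):
    """M * |h|**(n+1) / (n+1)!, where M bounds |f^(n+1)| on the interval."""
    return M * abs(h) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    a = 0.3
    for n in (1, 3, 6):
        for h in (1.0, 0.5, 0.25):
            error = abs(math.cos(a + h) - taylor_polynomial(a, h, n))
            print(n, h, error, lagrange_bound(h, n))
```

Halving $h$ reduces the bound by the factor $2^{n+1}$, which is what "error of order $n + 1$ in $h$" means in practice.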

¹ The generalized mean value theorem was proved for the case of a positive $p(t)$,
but it applies equally well when $p(t)$ is negative throughout the interval of integration.

Our interest will be directed chiefly toward the question whether the
remainder $R_n$ tends to zero as $n$ increases; if this is the case, we say
that we have expanded the function in an infinite Taylor series

$$f(a + h) = f(a) + \frac{h}{1!} f'(a) + \frac{h^2}{2!} f''(a) + \frac{h^3}{3!} f'''(a) + \cdots; \tag{23}$$

in particular, if we first put $a = 0$ and then write $x$ in place of $h$, we
obtain the “power series”

$$f(x) = f(0) + \frac{x}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \cdots.$$

We shall discuss examples in Section 5.5.

For applications the finite Taylor expansion (22) for a fixed $n$ with the
remainder term is just as important. If we let $h$ tend to zero in this
formula, then, in the terminology of Chapter 3, p. 252, the various terms of
the series tend to zero with different orders of magnitude in $h$. The
expression $f(a)$ represents the term of zero order in Taylor’s series, the
expression $h f'(a)$ the term of first order, the expression $h^2 f''(a)/2!$ the
term of second order, etc. We see from the form of the remainder
that in expanding a function as far as the term of $n$th order we make an
error which tends to zero of order $(n + 1)$ as $h$ tends to zero. The
nearer the point $a + h$ lies to the point $a$, the better is the representation
of the function $f(a + h)$ by the approximating polynomial $P_n(h)$;
in the cases of greatest interest the approximation in the immediate
neighborhood of the point $a$ can be improved by increasing the value
of $n$.


b. An Alternative Derivation of Taylor’s Formula 


The integral representation (17) for the remainder term $R_n$ in Taylor’s
theorem was based on formula (16) for $R_n'(a)$. Because of the importance
of the theorem we give here a different version of the derivation, which
leads directly to the expression for $R_n$ by repeated integration by parts
starting with the formula:

$$f(b) - f(a) = \int_a^b f'(t)\, dt. \tag{24}$$

To transform (24) by successive integration by parts, we introduce
the functions

$$\phi_0(t), \ \phi_1(t), \ \ldots, \ \phi_\nu(t), \ \ldots$$

by the relations:

$$\phi_0(t) = 1, \qquad \phi_\nu'(t) = \phi_{\nu-1}(t) \tag{25}$$

and the conditions

$$\phi_\nu(b) = 0 \qquad \text{for } \nu \ge 1, \tag{26}$$

where we consider $b$ as a fixed parameter. Clearly, the conditions (25)
and (26) determine successively all $\phi_\nu(t)$. As is verified immediately,
the $\phi_\nu(t)$ are just the polynomials

$$\phi_\nu(t) = \frac{(t - b)^\nu}{\nu!}.$$

We note in passing that the functions $\phi_\nu$ originate from each other by
successive integration, leaving constants of integration open; therefore the
defining conditions (25) could also be satisfied by functions satisfying other
side conditions instead of (26) (see p. 189).


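Since $\phi_\nu(a) = (-1)^\nu (b - a)^\nu/\nu!$ and $\phi_\nu(b) = 0$, we obtain

$$f(b) - f(a) = \int_a^b \phi_0 f'\, dt = \phi_1 f' \Big|_a^b - \int_a^b \phi_1 f''\, dt = (b - a) f'(a) - \int_a^b \phi_1 f''\, dt;$$

integrating the last term again by parts, we find

$$f(b) - f(a) = (b - a) f'(a) - \phi_2 f'' \Big|_a^b + \int_a^b \phi_2 f'''\, dt
= (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \int_a^b \phi_2 f'''\, dt,$$

and repeating the process $n$ times,

$$f(b) - f(a) = (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + \int_a^b (-1)^n \phi_n(t) f^{(n+1)}(t)\, dt$$

$$= (b - a) f'(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + R_n,$$

where, by the definition of $\phi_n$,

$$R_n = \int_a^b \frac{(b - t)^n}{n!} f^{(n+1)}(t)\, dt.$$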

Thus we have again proved


TAYLOR’S THEOREM. If a function $f(t)$ has continuous derivatives
up to the $(n + 1)$th order on a closed interval containing the two
points $a$ and $b$, then:

$$f(b) = f(a) + (b - a) f'(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + R_n$$

with the remainder $R_n$, depending on $n$, $a$, and $b$, given by the expression

$$R_n = \frac{1}{n!} \int_a^b (b - t)^n f^{(n+1)}(t)\, dt. \tag{27}$$

n 
By changes of notation we obtain slightly different expressions of the
Taylor formula. Thus, replacing $a$ by $x$ and $b$ by $x + h$, we have

$$f(x + h) = f(x) + h f'(x) + \cdots + \frac{h^n}{n!} f^{(n)}(x) + R_n, \tag{27a}$$

with

$$R_n = \frac{1}{n!} \int_x^{x+h} (x + h - t)^n f^{(n+1)}(t)\, dt;$$

or with $t = x + \tau$,

$$R_n = \frac{1}{n!} \int_0^h (h - \tau)^n f^{(n+1)}(x + \tau)\, d\tau. \tag{27b}$$

If we set $x = 0$ and write $x$ in place of $h$, we obtain¹

$$f(x) = f(0) + \frac{x}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \cdots + \frac{x^n}{n!} f^{(n)}(0) + R_n \tag{27c}$$

with the remainder

$$R_n = \frac{1}{n!} \int_0^x (x - t)^n f^{(n+1)}(t)\, dt.$$

Applying the mean value theorem of integral calculus or its generalized
form to the integral leads to the Cauchy formula

$$R_n = \frac{(1 - \theta)^n}{n!} x^{n+1} f^{(n+1)}(\theta x)$$

and, respectively, the Lagrange formula

$$R_n = \frac{x^{n+1}}{(n + 1)!} f^{(n+1)}(\theta x)$$

for the remainder, as was shown before (p. 448). Here $\theta$ is a suitable
nonspecified number with $0 < \theta < 1$ (not the same in both formulas).

¹ This special case of the theorem is sometimes, without historical justification,
called Maclaurin’s theorem. Taylor's general theorem was published in 1715;
Maclaurin’s special result, in 1742.

As an exercise, the reader should construct functions $\phi_\nu$ satisfying
(25) for which the side conditions (26) are replaced by the relations

$$\int_0^1 \phi_\nu(t)\, dt = 0$$

for $\nu \ge 1$ (see Chapter 8, Appendix A).
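
The integral form (27) of the remainder is exact, and this can be confirmed numerically. The Python sketch below is an added illustration under stated assumptions: it takes $f = \exp$, whose derivatives all equal $\exp$, and evaluates the integral by a composite Simpson rule written out here; all names are illustrative.

```python
import math


def simpson(g, lo, hi, panels=2000):
    """Composite Simpson rule on [lo, hi] with an even number of panels."""
    if panels % 2:
        panels += 1
    step = (hi - lo) / panels
    total = g(lo) + g(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(lo + i * step)
    return total * step / 3


def integral_remainder(a, b, n):
    """R_n from (27) for f = exp: (1/n!) * integral of (b - t)**n * exp(t)."""
    return simpson(lambda t: (b - t) ** n * math.exp(t), a, b) / math.factorial(n)


def direct_remainder(a, b, n):
    """f(b) minus the Taylor polynomial of order n about a, for f = exp."""
    polynomial = sum((b - a) ** v / math.factorial(v) * math.exp(a)
                     for v in range(n + 1))
    return math.exp(b) - polynomial


if __name__ == "__main__":
    for a, b, n in ((0.0, 1.0, 3), (0.5, -1.0, 4), (1.0, 2.5, 2)):
        print(a, b, n, integral_remainder(a, b, n), direct_remainder(a, b, n))
```

The second test case has $b < a$; the formula remains valid because the theorem only requires both points to lie in an interval where $f^{(n+1)}$ is continuous.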


5.5 Expansions of the Elementary Functions 


The preceding general results permit us to expand the simple elemen- 
tary functions in Taylor series. Expansions of other functions will be 
discussed in Chapter 7. 


a. The Exponential Function 


First we expand the exponential function, $f(x) = e^x$. In this case
all the derivatives are identical with $f(x)$ and have the value 1 for
$x = 0$. Lagrange’s form for the remainder (p. 449, Equation (21))
yields at once the formula:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n + 1)!} e^{\theta x}, \qquad 0 < \theta < 1.$$

If we now let $n$ increase beyond all bounds, the remainder $R_n$ tends
to zero for any fixed value of $x$. To prove this we note first that
$e^{\theta x} \le e^{|x|}$ since $e^x$ is a monotone increasing function. Let $m$ be any
integer greater than $2|x|$. Then for all $k \ge m$, $|x|/k < \frac{1}{2}$, and for $n \ge m$

$$\left| \frac{x^{n+1}}{(n + 1)!} \right| = \frac{|x|^m}{m!} \cdot \frac{|x|}{m + 1} \cdots \frac{|x|}{n + 1}
< \frac{|x|^m}{m!} \cdot \frac{1}{2^{n+1-m}} < \frac{|2x|^m}{m!} \cdot \frac{1}{2^n},$$

so that

$$|R_n| < \frac{|2x|^m}{m!}\, e^{|x|}\, \frac{1}{2^n}.$$

Since the first two factors on the right are independent of $n$, whereas
$1/2^n \to 0$ for $n \to \infty$, our statement is proved.
The function $e^x$ therefore is represented by the infinite series

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^\nu}{\nu!}.$$

This expansion is valid for all values of $x$. In particular, for $x = 1$ we
obtain again the infinite series that served to define the number $e$ in
Chapter 1 (cf. p. 77).

Of course, for numerical calculations we must make use of the
form of Taylor’s theorem with the remainder; for $x = 1$, for
example (compare with the similar computation on p. 78), we have

$$e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \frac{e^\theta}{(n + 1)!}.$$

If we wish to calculate $e$ with an error of at most 1/10,000, we need only
choose $n$ so large that the remainder is less than 1/10,000, and since
this remainder is certainly less¹ than $3/(n + 1)!$, it suffices to choose
$n = 7$, since $8! > 30{,}000$. We thus obtain the approximate value
$e = 2.71825$, with an error less than 0.0001.
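
The choice $n = 7$ can be reproduced mechanically. The following Python sketch is an added illustration with hypothetical function names. It searches for the smallest $n$ with $3/(n + 1)! < 1/10{,}000$ and evaluates the partial sum exactly with rational arithmetic.

```python
from fractions import Fraction
from math import e, factorial


def e_partial_sum(n):
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n! as a Fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


def smallest_n(tolerance):
    """Smallest n with 3/(n+1)! < tolerance, using the bound e**theta < 3."""
    n = 0
    while Fraction(3, factorial(n + 1)) >= tolerance:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_n(Fraction(1, 10000))
    approximation = e_partial_sum(n)
    print(n, approximation, float(approximation), e - float(approximation))
```

The partial sum for $n = 7$ is the fraction $685/252 = 2.71825\ldots$, in agreement with the value quoted in the text.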


b. Expansion of sin x, cos x, sinh x, cosh x 


For the functions sin $x$, cos $x$, sinh $x$, cosh $x$ we find the following
formulas:

    f(x)     =   sin x     cos x    sinh x    cosh x,
    f'(x)    =   cos x    -sin x    cosh x    sinh x,
    f''(x)   =  -sin x    -cos x    sinh x    cosh x,
    f'''(x)  =  -cos x     sin x    cosh x    sinh x,
    f''''(x) =   sin x     cos x    sinh x    cosh x.

Thus in the approximating polynomials in $x$ for sin $x$ and sinh $x$, the
coefficients of the even powers of $x$ will vanish, whereas in those
for cos $x$ and cosh $x$ the coefficients of the odd powers vanish.
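
Because the derivatives repeat with period four, the Taylor coefficients at $x = 0$ can be read off from the table. The Python sketch below is an added illustration; the dictionary of values at zero and the function names are assumptions made for the example. It builds the coefficients $f^{(\nu)}(0)/\nu!$ and confirms which powers drop out.

```python
import math
from fractions import Fraction

# Values at x = 0 of f, f', f'', f''' taken from the table above;
# the pattern then repeats with period 4.
DERIVATIVES_AT_ZERO = {
    "sin": (0, 1, 0, -1),
    "cos": (1, 0, -1, 0),
    "sinh": (0, 1, 0, 1),
    "cosh": (1, 0, 1, 0),
}


def maclaurin_coefficients(name, n):
    """Return [f(0)/0!, f'(0)/1!, ..., f^(n)(0)/n!] as exact fractions."""
    cycle = DERIVATIVES_AT_ZERO[name]
    return [Fraction(cycle[v % 4], math.factorial(v)) for v in range(n + 1)]


def evaluate(coefficients, x):
    """Evaluate sum(coefficients[v] * x**v) in floating point."""
    return sum(float(c) * x**v for v, c in enumerate(coefficients))


if __name__ == "__main__":
    print(maclaurin_coefficients("sin", 5))
    print(evaluate(maclaurin_coefficients("cos", 10), 0.8), math.cos(0.8))
```

For sin and sinh every even-indexed entry of the list is zero, and for cos and cosh every odd-indexed entry is zero, as stated above.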


¹ Here we have made use of the fact that $e < 3$. This follows (cf. p. 78) from our
series for $e$; for it is always true that $1/n! \le 1/2^{n-1}$, and therefore
$e < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots = 3$.

When we use Lagrange’s form of the remainder (21), p. 449, the Taylor
series for our functions take the form:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots + \frac{(-1)^n x^{2n+1}}{(2n + 1)!} + \frac{(-1)^{n+1} x^{2n+3}}{(2n + 3)!} \cos(\theta x),$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots + \frac{(-1)^n x^{2n}}{(2n)!} + \frac{(-1)^{n+1} x^{2n+2}}{(2n + 2)!} \cos(\theta x),$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n+1}}{(2n + 1)!} + \frac{x^{2n+3}}{(2n + 3)!} \cosh(\theta x),$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!} + \frac{x^{2n+2}}{(2n + 2)!} \cosh(\theta x),$$

where, of course, in each of the four formulas $\theta$ denotes a different
number in the interval $0 < \theta < 1$, a number which in addition depends
on $n$ and on $x$. Since in each of these formulas the remainder tends to
zero as $n$ increases, as can be seen by exactly the same argument as in
the case of $e^x$, we can make the approximations as precise as we wish.
We thus obtain the four infinite series, valid for all values of $x$:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu x^{2\nu+1}}{(2\nu + 1)!},$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu x^{2\nu}}{(2\nu)!},$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^{2\nu+1}}{(2\nu + 1)!},$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^{2\nu}}{(2\nu)!}.$$

The last two may also be obtained formally from the series for $e^x$ in
accordance with the definitions of the hyperbolic functions (see p. 228).


c. The Binomial Series


We pass over the Taylor series for the functions log $(1 + x)$ and arc
tan $x$ already treated directly in Section 5.2. We shall, however, take up
the generalization of the binomial theorem for arbitrary exponents,
which was one of the most spectacular of Newton’s mathematical discoveries.
We wish to expand the function $f(x) = (1 + x)^\alpha$ in a Taylor
series where $x > -1$ and $\alpha$ is an arbitrary number, positive or negative,
rational or irrational. The function $(1 + x)^\alpha$ is chosen instead of $x^\alpha$
since for the latter at the point $x = 0$ it is not true that all the derivatives
are continuous, except in the trivial case of nonnegative integral
values of $\alpha$. We first calculate the derivatives of $f(x)$, obtaining

$$f'(x) = \alpha (1 + x)^{\alpha - 1},$$
$$f''(x) = \alpha(\alpha - 1)(1 + x)^{\alpha - 2},$$
$$\ldots$$
$$f^{(\nu)}(x) = \alpha(\alpha - 1) \cdots (\alpha - \nu + 1)(1 + x)^{\alpha - \nu}.$$

In particular, for $x = 0$ we have

$$f'(0) = \alpha, \quad f''(0) = \alpha(\alpha - 1), \quad \ldots, \quad f^{(\nu)}(0) = \alpha(\alpha - 1) \cdots (\alpha - \nu + 1).$$

Taylor’s theorem then states

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \cdots + \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!} x^n + R_n.$$

Convergence 


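We must yet discuss the remainder. This problem is not very difficult,
but nonetheless is not quite so simple as the cases previously
treated. We shall obtain an estimate for the remainder both directly
and also as a special case of a general result of Section A.4. This will
permit us to conclude that whenever $|x| < 1$, the remainder $R_n$ for the
binomial expansion tends to zero. Thus the expression $(1 + x)^\alpha$ may
be expanded in the infinite binomial series

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \cdots = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu,$$

where for brevity we have introduced the general binomial coefficients

$$\binom{\alpha}{\nu} = \frac{\alpha(\alpha - 1) \cdots (\alpha - \nu + 1)}{\nu!} \quad (\text{for } \nu > 0), \qquad \binom{\alpha}{0} = 1.$$

*To prove directly that the remainder $R_n \to 0$ for $n \to \infty$ in the case
where $-1 < x < 1$, we make use of Cauchy’s form of the remainder
(19), p. 448:

$$R_n = \frac{(1 - \theta)^n}{n!} x^{n+1} f^{(n+1)}(\theta x)
= \frac{(1 - \theta)^n}{n!} \alpha(\alpha - 1)(\alpha - 2) \cdots (\alpha - n)\, x^{n+1} (1 + \theta x)^{\alpha - n - 1}$$

$(0 < \theta < 1)$. Since $|x| < 1$, we have $0 < (1 - \theta)/(1 + \theta x) < 1$, so

$$|R_n| \le (1 + \theta x)^{\alpha - 1}\, |\alpha x| \left| \left(1 - \frac{\alpha}{1}\right) x \right| \left| \left(1 - \frac{\alpha}{2}\right) x \right| \cdots \left| \left(1 - \frac{\alpha}{n}\right) x \right|.$$

There exists a number $q$ with $|x| < q < 1$. Then obviously also

$$\left| \left(1 - \frac{\alpha}{m}\right) x \right| < q$$

for all sufficiently large $m$, say for $m > N$. Thus for $n > N$

$$|R_n| \le (1 + \theta x)^{\alpha - 1}\, |\alpha|\, (1 + |\alpha|)^N q^{n - N}.$$

The factor $(1 + \theta x)^{\alpha - 1}$ is bounded (by $2^{\alpha - 1}$ if $\alpha \ge 1$, by $(1 - q)^{\alpha - 1}$ if
$\alpha < 1$) so that clearly $R_n \to 0$.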

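A slightly more general formula gives an expression for $(a + b)^\alpha$.
We only have to factor out $a^\alpha$ and apply the binomial expansion with
$x = b/a$ to obtain for $a > 0$ and $|b| < a$

$$(a + b)^\alpha = a^\alpha \left(1 + \frac{b}{a}\right)^\alpha
= a^\alpha \left[ 1 + \alpha \frac{b}{a} + \frac{\alpha(\alpha - 1)}{1 \cdot 2} \left(\frac{b}{a}\right)^2 + \cdots \right]$$

$$= a^\alpha + \frac{\alpha}{1} a^{\alpha - 1} b + \frac{\alpha(\alpha - 1)}{1 \cdot 2} a^{\alpha - 2} b^2 + \cdots.$$
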
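
The binomial series lends itself to direct computation. The following Python sketch is an added illustration with hypothetical function names. It builds the general binomial coefficients by the product $\alpha(\alpha - 1) \cdots (\alpha - \nu + 1)/\nu!$ and compares partial sums with $(1 + x)^\alpha$ for $|x| < 1$.

```python
def binomial_coefficient(alpha, v):
    """General binomial coefficient alpha(alpha-1)...(alpha-v+1)/v!; 1 for v = 0."""
    result = 1.0
    for k in range(v):
        result *= (alpha - k) / (k + 1)
    return result


def binomial_series(alpha, x, n):
    """Partial sum over v = 0..n of C(alpha, v) * x**v."""
    return sum(binomial_coefficient(alpha, v) * x**v for v in range(n + 1))


if __name__ == "__main__":
    for alpha, x in ((0.5, 0.2), (-1, 0.5), (1 / 3, -0.9)):
        for n in (5, 20, 80):
            print(alpha, x, n, binomial_series(alpha, x, n), (1 + x) ** alpha)
```

When $\alpha$ is a nonnegative integer the coefficients vanish for $\nu > \alpha$, so the series terminates and reduces to the ordinary binomial theorem, valid for every $x$. For other exponents the convergence slows markedly as $|x|$ approaches 1, as the case $x = -0.9$ shows.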
5.6 Geometrical Applications 


The behavior of a function $f(x)$ in a neighborhood of the point
$x = a$, or the behavior of a given curve in a neighborhood of one of
its points, can be described in detail by means of Taylor’s theorem,
since this theorem permits us to resolve the increment of the function
on passing to a neighboring point $x = a + h$ into a sum of quantities
of the first order, second order, etc., in $h$.


a. Contact of Curves 


Contact of Higher Order 


If at a point x = a, two curves y = f(x) and y = g(x) intersect and 
have a common tangent, we say that the curves touch one another or 
have contact of the first order. In this case the Taylor expansions of the 
functions f(a + h) and g(a + h) have the same terms of zero order and 
first order in h. If, in addition, at the point x = a the second derivatives
of f(x) and g(x) are also equal to each other, we say that the curves 
have contact of the second order. Then the terms of second order 
in the Taylor expansions of f and g will also agree. If we assume that 
both functions have continuous derivatives of at least the third order, 
then the difference 


$$D(x) = f(x) - g(x)$$

can be expressed in the form

$$D(a + h) = f(a + h) - g(a + h) = \frac{h^3}{3!}\,D'''(a + \theta h) = \frac{h^3}{3!}\,F(h),$$

where the expression $F(h)$ tends to $f'''(a) - g'''(a)$ as h tends to zero.
The difference D(a + h) therefore vanishes to at least the third order
with h.

We can proceed in this way and consider the general case where the 
Taylor series for f(x) and g(x) agree up to terms of the nth order; 
that is, 

$$f(a) = g(a), \quad f'(a) = g'(a), \quad \ldots, \quad f^{(n)}(a) = g^{(n)}(a).$$


We assume that the (n + 1)th derivatives are continuous. Under these 
conditions the curves defined by our two functions are said to have 
contact of the nth order at the point x = a. The difference of the two 
functions is then of the form 


$$D(a + h) = f(a + h) - g(a + h) = \frac{h^{n+1}}{(n + 1)!}\,D^{(n+1)}(a + \theta h) = \frac{h^{n+1}}{(n + 1)!}\,F(h),$$

where since $0 < \theta < 1$ the quantity $F(h) = D^{(n+1)}(a + \theta h)$ tends to
$f^{(n+1)}(a) - g^{(n+1)}(a)$ as h tends to zero. We see from this formula that
at the point of contact the difference f(x) − g(x) vanishes to at least
the (n + 1)th order.

Figure 5.1 Osculating parabolas of $e^x$.

The Taylor polynomials defined by

$$P_n(x) = f(a) + \frac{x - a}{1!}\,f'(a) + \cdots + \frac{(x - a)^n}{n!}\,f^{(n)}(a)$$


are characterized geometrically as the “parabolas” of the nth order 
having contact of the greatest possible order with the graph of the 
given function at the given point. Hence these parabolas are sometimes 
called osculating parabolas. (Only for n = 2 are these curves “parabolas”
in the ordinary sense.)

For the function $y = e^x$, Fig. 5.1 shows the first three osculating
parabolas at the point x = 0.
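
As a numerical illustration of contact of order n, the sketch below (Python; the function names are hypothetical) evaluates the osculating parabolas of $e^x$ at a = 0 and the difference $f(a + h) - P_n(a + h)$. Under the formulas above, this difference divided by $h^{n+1}$ should be close to $f^{(n+1)}(0)/(n + 1)! = 1/(n + 1)!$ for small h.

```python
from math import exp, factorial


def osculating_parabola(n, h):
    """P_n(h) for f(x) = e^x at a = 0, where every derivative equals 1."""
    return sum(h**k / factorial(k) for k in range(n + 1))


def difference(n, h):
    """f(a + h) - P_n(a + h) for f(x) = e^x and a = 0."""
    return exp(h) - osculating_parabola(n, h)


for n in (1, 2, 3):
    for h in (0.1, -0.1):
        d = difference(n, h)
        # d / h^(n+1) is compared with 1 / (n+1)!
        print(n, h, d, d / h ** (n + 1), 1 / factorial(n + 1))
```

For n = 2 the difference has opposite signs for h = 0.1 and h = −0.1, whereas for n = 1 and n = 3 it is positive on both sides of the point of contact.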
Two curves y = f(x) and y = g(x) that have contact of the nth
order at a point x = a might possibly have contact of an even higher
order; that is, the equation $f^{(n+1)}(a) = g^{(n+1)}(a)$ might also be true.
If this is not the case, that is, if $f^{(n+1)}(a) \ne g^{(n+1)}(a)$, we say that the
order of contact is exactly n.¹


Contact of Even or Odd Order 


From our formulas as well as from intuition we can state a remarkable 
fact often unnoticed by beginners. Let the contact of two curves be 
exactly of even order; that is, an even number n of derivatives of the 
two functions have the same value at the point in question, whereas the 
(n + 1)th derivatives differ. Then the preceding formulas show that 
the difference f(a + h) − g(a + h) has different signs for small positive
values of h and for numerically small negative values of h. The two
curves then cross at the point of contact. This occurs, for instance, in
contact of the second order if the third derivatives have different
values. In contrast, contact exactly of an odd order, for example, an
ordinary contact of the first order, implies that the difference
f(a + h) − g(a + h) has the same sign for all numerically small values of h,
positive or negative; the two curves therefore do not cross in a
neighborhood of the point of contact. The simplest example is the contact
of a curve with its tangent. The tangent can cross the curve only at
points where the contact is at least of second order; it does actually
cross the curve at points where the order of contact is even, for example,
at an ordinary point of inflection where $f''(x) = 0$ but $f'''(x) \ne 0$. At
points where the order of contact is odd the tangent does not cross
the curve, as for example, at an ordinary point of the curve where
the second derivative is not zero, or at the origin for the curve $y = x^4$,
where the contact with the tangent is of the third order.

We know from Chapter 4, p. 360, that for the circle of curvature at 
the point x = a given by the function y = g(x) in a neighborhood of the 
point x = a, we not only have g(a) = f(a) and g'(a) = f'(a), but also
g''(a) = f''(a). Hence the circle of curvature is at the same time the
osculating circle at the point of the curve under discussion; that is, it is 
the circle which at that point has contact of the second order with 
the curve. In the limiting case of a point of inflection, or in general, of 
a point at which the curvature is zero and the radius of curvature is 
infinite, the circle of curvature degenerates into the tangent. In ordinary
cases, when the contact at the point in question is not of an order
higher than the second, the circle of curvature does not merely touch
the curve, but also crosses it (cf. Fig. 4.23, p. 359).

1 That the order of contact of two curves is a genuine geometrical relation which
is unaffected by change of axes is a fact which can be easily confirmed by means
of the formulas for change of axes (see Chapter 4, p. 360).

In conclusion it should be mentioned that sometimes contact of 
order exactly m is described by saying: the curves have m + 1 infinitely 
near points in common; of course, the precise meaning of such a 
statement obviously refers to a limiting process. If the curves have, in 
fact, m + 1 distinct points $P, P_1, \ldots, P_m$ in common and if we let all
the points $P_i$ tend to P, if necessary modifying one of the curves, then
the limiting position might be expected to be that of two curves with
a contact of order m. For example, if we draw a circle through three
points $P, P_1, P_2$ on a curve C and then let $P_1$ and $P_2$ tend to P, it can be
seen that the circle tends to the circle of curvature of C at P. (See
Problem 4, p. 437.) 


b. On the Theory of Relative Maxima and Minima 


As we have already seen in Chapter 3, p. 243, a function f(x), whose 
first derivative vanishes at x = a, has a relative maximum at the point if
f''(a) is negative, a minimum if f''(a) is positive. These conditions,
therefore, are sufficient conditions for the occurrence of a maximum or
minimum. They are by no means necessary; for in the case when
f''(a) = 0 there are three possibilities open; at the point in question
the function may have a maximum or a minimum or neither. Examples
of the three possibilities are given by the functions $y = -x^4$, $y = x^4$,
and $y = x^3$ at the point x = 0. Taylor’s theorem at once enables us to
make a general statement of sufficient conditions for a maximum or a
minimum. We need only to expand the function f(a + h) in powers of
h; the essential point is then to find whether the first nonvanishing
term contains an even or an odd power of h. In the first case we have a
maximum or a minimum depending on whether the coefficient of this
power of h is negative or positive; in the second case we have a horizontal
inflectional tangent and neither maximum nor minimum. The reader
may complete the argument for himself using the formula for the
remainder.¹
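
The rule just stated can be written as a short procedure. The Python sketch below is only an illustration; the function name and the convention of passing the derivative values $f'(a), f''(a), \ldots$ as a list are assumptions made here, not part of the text.

```python
def classify_stationary_point(derivatives):
    """Classify x = a from the values f'(a), f''(a), f'''(a), ...

    derivatives[k - 1] is assumed to hold f^(k)(a). The first nonvanishing
    entry decides: even order gives an extremum, odd order (> 1) gives a
    horizontal inflectional tangent.
    """
    for order, value in enumerate(derivatives, start=1):
        if value != 0:
            if order == 1:
                return "not stationary"
            if order % 2 == 1:
                return "horizontal inflection"
            return "minimum" if value > 0 else "maximum"
    return "undecided"  # every supplied derivative vanishes


# The three examples of the text at x = 0.
print(classify_stationary_point([0, 0, 0, -24]))  # y = -x^4
print(classify_stationary_point([0, 0, 0, 24]))   # y = x^4
print(classify_stationary_point([0, 0, 6]))       # y = x^3
```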


1 The necessary and sufficient condition given previously (p. 242), however, is
more general and more convenient in applications: provided the first derivative
f'(x) vanishes at only a finite number of points, a necessary and sufficient
condition for the occurrence of a maximum or minimum at one of these points is
that the first derivative f'(x) changes sign as the curve passes through the
point.
Appendix I


A.I.1 Example of a Function Which Cannot Be Expanded 
in a Taylor Series 


The possibility of expressing a function by means of a Taylor series 
with remainder of (n + 1)th order depends essentially on the continuity
and differentiability of the function at the point in question.
For this reason log x cannot be represented by a Taylor series in powers 
of x, and the same is true of the function $x^{1/3}$ whose derivative is infinite
at x = 0. 

In order that a function may be capable of being expanded in an 
infinite Taylor series, all its derivatives must necessarily exist at the 
point in question; however, this condition is by no means sufficient. 
A function for which all derivatives exist and are continuous throughout 
an interval still need not be capable of expansion in a Taylor series; 
that is, the remainder R,, in Taylor’s theorem may fail to tend to zero as 
n increases, no matter how small the interval is, in which we want to 
expand the function. 

An important simple example of this phenomenon is the function 


$$y = f(x) = e^{-1/x^2} \quad \text{for } x \ne 0, \qquad f(0) = 0,$$


which we have already considered in the Appendix to Chapter 3, 
p. 255. This function and all its derivatives are continuous in every 
interval, even at x = 0, and as we have seen, at this point all the
derivatives vanish, that is, $f^{(n)}(0) = 0$ for every value of n. (Geometrically,
this means that the line y = 0 has contact of infinite order with the curve
of the function at the point x = 0.) Hence in the Taylor expansion

$$f(x) = P_n(x) + R_n$$

all the coefficients of the approximating polynomials $P_n(x)$ vanish,
no matter what value is chosen for n. Thus the remainder remains
equal to the function itself and, except for x = 0, cannot approach
zero as n increases, since the function is positive for every other
value of x.
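
How flat this function is at the origin can be seen numerically. The Python sketch below (illustrative only; the name `flat` is hypothetical) evaluates $f(x)/x^n$ for a few small x and several n; the quotients tend to zero, which is consistent with the vanishing of every Taylor coefficient at x = 0, while f(x) itself stays positive for x ≠ 0.

```python
from math import exp


def flat(x):
    """f(x) = exp(-1/x^2) for x != 0, and f(0) = 0."""
    if x == 0:
        return 0.0
    t = 1.0 / x
    return exp(-(t * t))


for n in (1, 5, 10):
    for x in (0.5, 0.2, 0.1):
        # f(x) / x^n shrinks rapidly as x -> 0, for every fixed n
        print(n, x, flat(x) / x**n)
```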


Incidentally, this function is useful for the construction of functions 
exhibiting intuitively unexpected phenomena. For example, 


$$g(x) = e^{-1/x^2} \sin(1/x),$$

supplemented by g(0) = 0, is again a function with derivatives of all orders,
all of which vanish at x = 0; the graph of y = g(x) near x = 0 intersects
the x-axis infinitely many times, and oscillates infinitely often.


A.I.2 Zeros and Infinities of Functions 


a. Zeros of Order n 


The Taylor expansion of a function f(x) allows us to characterize the 
order to which a function vanishes at a point x =a. We say that a 
function f(x) has an exact n-fold zero at x = a or that it vanishes there 
exactly of order n, if $f(a) = 0$, $f'(a) = 0$, $f''(a) = 0$, ..., $f^{(n-1)}(a) = 0$,
and $f^{(n)}(a) \ne 0$. We expressly assume that in the neighborhood the
function has continuous derivatives at least to the nth order. By our
definition we imply that the Taylor series for the function in the
neighborhood of the point can be written in the form

$$f(a + h) = \frac{h^n}{n!}\,F(h), \qquad F(h) = f^{(n)}(a + \theta h), \tag{28}$$

in which as h tends to zero the factor $F(h) = n!\,f(a + h)/h^n$ tends to a
limit different from zero, namely, the value $f^{(n)}(a)$. Hence f(a + h)
has the same order as $h^n$ for $h \to 0$ or vanishes to order n in the sense
defined in Chapter 3, p. 252.

Similarly, expanding the derivatives $f'(x), f''(x), \ldots, f^{(\nu)}(x)$ by
Taylor’s theorem with the Lagrange form of the remainder, we obtain
a series of expressions

$$\begin{aligned}
f'(a + h) &= \frac{h^{n-1}}{(n - 1)!}\,F_1(h) = \frac{h^{n-1}}{(n - 1)!}\,f^{(n)}(a + \theta h), \\
&\;\;\vdots \\
f^{(\nu)}(a + h) &= \frac{h^{n-\nu}}{(n - \nu)!}\,F_\nu(h) = \frac{h^{n-\nu}}{(n - \nu)!}\,f^{(n)}(a + \theta h),
\end{aligned} \tag{29}$$

in all of which the factors $\theta$ may be different, whereas the factors
$F_1, F_2, \ldots, F_\nu$ tend continuously to $f^{(n)}(a)$ as $h \to 0$. Hence f' vanishes
of order n − 1, f'' of order n − 2, etc.
In these formulas, of course, the assumption is made that f(x)
vanishes of order $n > \nu$.


b. Infinity of Order v 


If a function $\phi(x)$ is defined at all points in a neighborhood of the
point x = a, except perhaps at x = a itself, and if $\phi(x) = f(x)/g(x)$,
where at x = a the numerator does not vanish, but the denominator
possesses a $\nu$-fold zero, we say that the function $\phi(x)$ becomes infinite of
the $\nu$th order at the point x = a. If at the point x = a the numerator
has a $\mu$-fold zero and if $\mu > \nu$, the function has a $(\mu - \nu)$-fold zero
there; if $\mu < \nu$, the function has a $(\nu - \mu)$-fold infinity at the point.

These definitions are in agreement with the conventions already laid 
down (cf. Section 3.7) regarding the behavior of a function. 


A.I.3 Indeterminate Expressions 


We now discuss in a more precise manner the “indeterminate
expressions” of the form $\phi(x) = f(x)/g(x)$, in which f(x) and g(x) both
vanish at the same point x = a, such as the function (sin x)/x at x = 0.
We shall always assign to such functions the value

$$\phi(a) = \lim_{h \to 0} \phi(a + h), \tag{30}$$

provided this limit exists.

These limiting values can be characterized by a simple rule, known
as L’Hospital’s rule, for which we assume that all derivatives of f and g
that arise are continuous in an interval containing a. We furthermore
assume that the denominator g(x) vanishes at x = a to an order $\nu$ not
higher than that of the numerator f(x), so that the function $\phi(x)$ does
not become infinite at x = a. Then the rule states

$$\phi(a) = \frac{f^{(\nu)}(a)}{g^{(\nu)}(a)}. \tag{31}$$

By the definition of continuity, the function $\phi(x)$ is then continuous
at x = a, and being continuous elsewhere, as long as $g(x) \ne 0$, $\phi$ is
continuous in an interval about a.

The proof follows immediately from the results of A.I.2; applying
Eq. (28), with $\nu$ in place of n, to both f and g, we find the function $\phi$ is,
in a neighborhood of a, given by the relation

$$\phi(a + h) = \frac{f(a + h)}{g(a + h)} = \frac{f^{(\nu)}(a + \theta h)}{g^{(\nu)}(a + \theta_1 h)}$$

(with values $\theta$, $\theta_1$ between 0 and 1 that may differ), whence the
continuity of the numerator and denominator, and the nonvanishing of
$g^{(\nu)}(a)$ yield (31). We can express the meaning of the
last equations in the following way: if the numerator and denominator
of a function $\phi(x) = f(x)/g(x)$ both vanish at x = a, we can determine
the limiting value as x tends to a by differentiating the numerator and
denominator an equal number of times until at least one of the
derivatives is not zero at the point. If we encounter a nonvanishing derivative
in the denominator before one appears in the numerator, the fraction
tends to zero. If a nonvanishing derivative in the numerator is met
before one in the denominator, the absolute value of the fraction
increases beyond all bounds.

We thus have a method of evaluating the so-called “indeterminate 
expression” 0/0, that is, of determining the limiting value of a quotient 
in which the numerator and denominator tend to zero. 
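
The procedure of differentiating numerator and denominator until one of the derivatives fails to vanish can be sketched as follows. This Python fragment is an illustration only; it assumes that the derivative values $f^{(k)}(a)$ and $g^{(k)}(a)$, k = 0, 1, 2, ..., are already known and supplied as lists, and the function name is hypothetical.

```python
from fractions import Fraction


def quotient_limit(f_derivs, g_derivs):
    """Limit of f(x)/g(x) as x -> a from the values f^(k)(a), g^(k)(a).

    f_derivs[k] and g_derivs[k] are assumed to hold the k-th derivatives at a.
    Returns None when the quotient increases beyond all bounds.
    """
    for fk, gk in zip(f_derivs, g_derivs):
        if fk == 0 and gk == 0:
            continue  # both still vanish: differentiate once more
        if gk == 0:
            return None  # nonvanishing numerator met first
        return Fraction(fk) / Fraction(gk)  # equals 0 when fk == 0
    raise ValueError("all supplied derivatives vanish; more are needed")


print(quotient_limit([0, 1, 0, -1], [0, 1, 0, 0]))  # sin x / x at 0
print(quotient_limit([0, 0, 1], [0, 0, 2]))         # (1 - cos x) / x^2 at 0
print(quotient_limit([0, 2], [0, 1]))               # (e^(2x) - 1) / log(1 + x) at 0
```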

We can arrive at our results in a somewhat different way by basing 
the proof on the generalized mean value theorem instead of on Taylor’s 
theorem (cf. p. 222). Accordingly, if $g'(x) \ne 0$ in a neighborhood of
the point a, we have

$$\frac{f(a + h) - f(a)}{g(a + h) - g(a)} = \frac{f'(a + \theta h)}{g'(a + \theta h)},$$

where $\theta$ is the same in both numerator and denominator. Hence, in
particular, when f(a) = 0 = g(a),

$$\frac{f(a + h)}{g(a + h)} = \frac{f'(a + \theta h)}{g'(a + \theta h)}.$$

Here $\theta$ is a value in the interval $0 < \theta < 1$, and putting $k = \theta h$, we
obtain

$$\lim_{h \to 0} \frac{f(a + h)}{g(a + h)} = \lim_{k \to 0} \frac{f'(a + k)}{g'(a + k)},$$

it being assumed that the limit on the right exists.

If f'(a) = 0 = g'(a), we proceed in the same manner until we reach a
first index $\mu$ for which it is no longer true that simultaneously
$f^{(\mu)}(a) = 0 = g^{(\mu)}(a)$. Then

$$\lim_{h \to 0} \frac{f(a + h)}{g(a + h)} = \lim_{l \to 0} \frac{f^{(\mu)}(a + l)}{g^{(\mu)}(a + l)} = \frac{f^{(\mu)}(a)}{g^{(\mu)}(a)},$$

an expression in which we include the case when both sides are
infinite.

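Examples. The following examples, which are significant in themselves,
illustrate the application of L’Hospital’s rule.

$$\lim_{x \to 0} \frac{\sin x}{x} = \frac{\cos 0}{1} = 1,$$

$$\lim_{x \to 0} \frac{1 - \cos x}{x} = \frac{\sin 0}{1} = 0,$$

$$\lim_{x \to 0} \frac{e^{2x} - 1}{\log (1 + x)} = \lim_{x \to 0} \frac{2e^{2x}}{1/(1 + x)} = 2,$$

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \lim_{x \to 0} \frac{\sin x}{2x} = \lim_{x \to 0} \frac{\cos x}{2} = \frac{1}{2}.$$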
Other Indeterminate Forms. We further note that other so-called
indeterminate forms can also be reduced to the case we have considered;
for example, the limit of

$$\frac{1}{\sin x} - \frac{1}{x}$$

as x tends to zero, is the limit of the difference of two expressions both
of which become infinite, or is an “indeterminate form” $\infty - \infty$. By
the transformation

$$\frac{1}{\sin x} - \frac{1}{x} = \frac{x - \sin x}{x \sin x}$$

we at once arrive at an expression whose limit as x tends to zero is
determined by our rule to be

$$\lim_{x \to 0} \frac{1 - \cos x}{x \cos x + \sin x} = \lim_{x \to 0} \frac{\sin x}{2 \cos x - x \sin x} = 0.$$


Derivatives of Indeterminate Forms 


The expressions $\phi(x) = f(x)/g(x)$ defined at x = a by our rule are not
only continuous but also have continuous derivatives provided that f
and g have continuous derivatives of sufficiently high order.

It suffices for us to establish this fact in the case where g vanishes to
first order at a, or g(a) = 0, $g'(a) \ne 0$. For $x \ne a$,

$$\phi'(x) = \frac{g(x) f'(x) - f(x) g'(x)}{(g(x))^2} = \frac{Z(x)}{N(x)},$$

where again, both numerator and denominator vanish at x = a, since
f(a) = g(a) = 0. Hence we can determine the limiting value by
applying our rule

$$\lim_{x \to a} \phi'(x) = \lim_{x \to a} \frac{Z'(x)}{N'(x)}.$$

Clearly, $dN(x)/dx = 2g(x)g'(x)$ and $dZ(x)/dx = g(x)f''(x) - f(x)g''(x)$,
both of which again vanish at x = a. Applying L’Hospital’s rule once
more, as

$$\lim_{x \to a} \phi'(x) = \lim_{x \to a} \frac{Z''(x)}{N''(x)},$$

and noting that $N''(x) = 2g(x)g''(x) + 2(g'(x))^2$, which does not vanish
at x = a, we find that

$$\lim_{x \to a} \phi'(x) = \frac{g'(a) f''(a) - f'(a) g''(a)}{2(g'(a))^2},$$

and this limit is indeed the derivative of $\phi(x)$ at x = a (see Chapter
3, p. 261).

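Similar rules for indeterminate forms hold for $x \to \infty$. Thus let f(x) and
g(x) be functions for which $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$, while
$g'(x) \ne 0$ for all sufficiently large x and the limit of $f'(x)/g'(x)$ for
$x \to \infty$ exists. Then

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}.$$

The proof follows again from the mean value theorem of differential
calculus.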


*A.I.4 The Convergence of the Taylor Series of a Function 
with Nonnegative Derivatives of All Orders 


We insert a general theorem concerning the convergence of Taylor’s 
expansion for functions all of whose derivatives are nonnegative. 

Consider the class of functions f(x), differentiable to all orders on the
closed interval $a \le x \le b$, all of whose derivatives are nonnegative on
this interval:

$$f^{(\nu)}(x) \ge 0 \qquad (\nu = 1, 2, 3, \ldots).$$

We shall show: For every such function the corresponding Taylor
expansion of f(x + h) in powers of h converges, and the series represents
the value of f(x + h) when x and $\xi = x + h$ lie in the open interval
(a, b) and $|h| < b - x$.

For the proof we start with the observation that $f'(x) \ge 0$ by
assumption and hence

$$0 \le f(x) - f(a) = \int_a^x f'(\xi)\,d\xi \le \int_a^b f'(\xi)\,d\xi = f(b) - f(a) = M.$$

Moreover, for x and $\xi = x + h$ in the interval between a and b, we may
write

$$f(x + h) - f(x) = h f'(x) + \frac{h^2}{2!}\,f''(x) + \cdots + \frac{h^n}{n!}\,f^{(n)}(x) + R_n.$$

Assume first that h > 0, or $x < \xi < b$. Then all of the terms on the
right-hand side are nonnegative¹ and so each is not greater than the
value of the left-hand side or than M; thus

$$0 \le \frac{f^{(n)}(x)}{n!} \le \frac{M}{h^n} = \frac{M}{(\xi - x)^n}.$$

For $\xi \to b$ it follows that

$$0 \le \frac{f^{(n)}(x)}{n!} \le \frac{M}{(b - x)^n}. \tag{32}$$

1 This follows for $R_n$ from the Cauchy or Lagrange formulas and the assumption
$f^{(n+1)} \ge 0$.


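Now, using Cauchy’s formula ((19), p. 448) for the remainder, we
know there exists some $\theta$ in the interval $0 < \theta < 1$, such that

$$0 \le R_n = \frac{(1 - \theta)^n h^{n+1}}{n!}\,f^{(n+1)}(x + \theta h)
\le \frac{h^{n+1}(n + 1)(1 - \theta)^n M}{(b - x - \theta h)^{n+1}}.$$

Since $\xi = x + h < b$, we may choose a positive number p such that

$$h < \frac{b - x}{1 + p} \quad \text{or} \quad b - x - \theta h > h(1 + p - \theta).$$

We then have

$$0 \le R_n \le \frac{M h^{n+1}(n + 1)(1 - \theta)^n}{h^{n+1}(1 + p - \theta)^{n+1}}$$

or

$$0 \le R_n \le \frac{M(n + 1)}{1 + p - \theta}\left(\frac{1 - \theta}{1 + p - \theta}\right)^n \le \frac{M(n + 1)}{p}\,\frac{1}{(1 + p)^n},$$

since

$$\frac{1 - \theta}{1 - \theta + p} = \frac{1}{1 + p/(1 - \theta)} \le \frac{1}{1 + p}.$$

We know (Chapter 1, p. 70) that $(n + 1)/(1 + p)^n$ tends to zero as
n increases, so that $R_n$ tends to zero as n increases, when $0 \le h < b - x$;
thus Taylor’s series tends to the function f for $h \ge 0$.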
For negative h, the fact that $R_n$ tends to zero with increasing n
follows by using the Lagrange form (21), p. 449, for $R_n$:

$$|R_n| = \frac{1}{(n + 1)!}\,|h|^{n+1} f^{(n+1)}(x - \theta |h|) \qquad (0 < \theta < 1).$$

Now $f^{(n+2)}$ is nonnegative and hence $f^{(n+1)}$ is monotone nondecreasing.
It follows then from the estimate (32) used above that

$$\frac{f^{(n+1)}(x - \theta |h|)}{(n + 1)!} \le \frac{f^{(n+1)}(x)}{(n + 1)!} \le \frac{M}{(b - x)^{n+1}}.$$

Therefore

$$|R_n| \le M \left(\frac{|h|}{b - x}\right)^{n+1},$$

and so $R_n$ tends to zero as n increases when

$$0 < -h < b - x.$$

Thus for any point x with a < x < b, the remainder $R_n$ in the
Taylor series for f(x + h) in powers of h will tend to zero once
$|h| < b - x$ and $h > -(x - a)$.

We note that our result is still true if we assume the inequality
$f^{(\nu)}(x) \ge 0$ only for all sufficiently large $\nu$, say for $\nu > N$ for some
integer N, whereas when $\nu \le N$ the sign of $f^{(\nu)}(x)$ may be arbitrary. To
prove this we need only replace the function f in our proof by the
function

$$g(x) = f(x) + M(x - a + 1)^N,$$

for M some positive constant. Then $g^{(\nu)}(x) = f^{(\nu)}(x) \ge 0$ for $\nu > N$,
and

$$g^{(\nu)}(x) = f^{(\nu)}(x) + M N(N - 1)\cdots(N - \nu + 1)(x - a + 1)^{N - \nu} \ge f^{(\nu)}(x) + M$$

for $\nu \le N$. Thus $g^{(\nu)}(x) \ge 0$ for all $\nu$ if M is chosen
sufficiently large. This proves that g(x + h) can be expanded in powers of h,
and the same result follows then for the function f, which differs from
g only by a polynomial.

The theorem on the binomial series (p. 456) is an immediate
consequence of this result: We change the notation slightly and consider
first the function $\phi(x) = (1 - x)^\alpha$ in place of $(1 + x)^\alpha$. The derivatives
of $\phi$ are then given by

$$\phi^{(\nu)}(x) = (-1)^\nu\,\nu!\,\binom{\alpha}{\nu}(1 - x)^{\alpha - \nu}.$$

Since the binomial coefficients

$$\binom{\alpha}{\nu} = \frac{\alpha(\alpha - 1)\cdots(\alpha - \nu + 1)}{\nu!}$$

have alternating signs as soon as $\alpha - \nu$ is negative, we see that either
the function $\phi(x)$ or $-\phi(x)$ belongs to the class of functions with
nonnegative derivatives from some order on when we limit x to values
x < 1. Thus for a = −1, b = 1, x = 0, and $|h| < b - x = 1$ our
general theorem proves that

$$(1 - h)^\alpha = \sum_{\nu=0}^{\infty} (-1)^\nu \binom{\alpha}{\nu} h^\nu.$$

If here we write x for −h, we obtain the binomial expansion

$$(1 + x)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu
= 1 + \alpha x + \frac{\alpha(\alpha - 1)}{1\cdot 2}\,x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{1\cdot 2\cdot 3}\,x^3 + \cdots$$

for any exponent $\alpha$ and any x with −1 < x < 1.
Appendix II Interpolation


*A.II.1 The Problem of Interpolation. Uniqueness 


The Taylor polynomial $P_n(x)$ approximates the function f(x) in
such a way that the graphs of f(x) and $P_n(x)$ have contact of order n
at a point a, or in such a way that f(x) and $P_n(x)$ coincide at n + 1
points “infinitely near” to a. We might “resolve” the point with
abscissa a into n + 1 distinct points with abscissas $x_0, x_1, \ldots, x_n$ and
seek an approximation to f(x) by a polynomial $\phi(x)$ of degree n which
coincides with f(x) at these points. This polynomial, as it turns out,
is determined uniquely by a system of linear equations. By a passage to
the limit $x_i \to a$ for all i we regain the Taylor polynomials. But
“interpolation,” that is, the approximation by polynomials coinciding
with f(x) in distinct points, is of great importance in many
applications. The following discussion will give a brief account of the theory
of interpolation.

We consider the following problem: Determine a polynomial
$\phi(x)$ of nth degree, so that it assumes at n + 1 given distinct points
$x_0, x_1, \ldots, x_n$ the n + 1 given values $f_0, f_1, \ldots, f_n$; that is,

$$\phi(x_0) = f_0, \quad \phi(x_1) = f_1, \quad \ldots, \quad \phi(x_n) = f_n.$$

If the numbers $f_i$ are the values $f_i = f(x_i)$ assumed by a given (possibly
less elementary) function f(x) at the points $x_i$, then the polynomial
$\phi(x)$ will be named the interpolation polynomial of nth degree of the
function f(x) for the points $x_0, x_1, \ldots, x_n$.

There can at most be one such polynomial of nth degree, for if there
were two different such polynomials $\phi(x)$ and $\psi(x)$, then their difference
$D(x) = \phi(x) - \psi(x)$ would be a polynomial of mth degree with
$0 \le m \le n$ having n + 1 distinct roots, which is not possible according
to elementary algebra.¹

We can prove the uniqueness of the interpolation polynomial by yet
another method, based on the

GENERAL THEOREM OF ROLLE. If a function F(x) has continuous
derivatives of order up to n in an interval, and vanishes at least at n + 1
distinct points $x_0, x_1, \ldots, x_n$ of the interval, then there is a point $\xi$ in the
interior of the interval for which $F^{(n)}(\xi) = 0$.

1 For we would have

$$D(x) = c_0 (x - x_1)(x - x_2)\cdots(x - x_m), \qquad c_0 \ne 0,$$

since $x_1, \ldots, x_m$ are zeros of D(x); but then since $D(x_0) = 0$,

$$c_0 (x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_m) = 0,$$

contrary to the distinctness of $x_0, x_1, \ldots, x_m$.

Proof. The general theorem follows easily from the special case
n = 1 which is the Rolle theorem proved on p. 175. Let the numbers
$x_0, x_1, \ldots, x_n$ be arranged in increasing order. Then by the mean value
theorem (or by Rolle’s theorem) the first derivative F'(x) must vanish
at least once within each of the n subintervals $(x_i, x_{i+1})$. This same
consideration applied to F'(x) and the intervals between its zeros tells
us that F''(x) vanishes at n − 1 points; by applying this argument
repeatedly, the assertion is proved.


We now apply this theorem to the difference

$$F(x) = D(x) = \phi(x) - \psi(x) = d_0 x^n + d_1 x^{n-1} + \cdots + d_n,$$

which by assumption vanishes at n + 1 points. We obtain a point $\xi$ at
which the nth derivative vanishes: $D^{(n)}(\xi) = 0$. This is, however,
$n!\,d_0$, so that $d_0 = 0$ and the difference is a polynomial of at most degree
n − 1, vanishing at n + 1 points. Again applying the theorem of
Rolle, we obtain $d_1 = 0$, etc., so that D(x) is identically 0 as we asserted.
These considerations can be extended to the case where the $x_i$ are
not all distinct from each other and, perhaps, r of the values $x_i$
agree; that is, $x_0 = x_1 = \cdots = x_{r-1}$. In the interpolation problem we
shall then require that $\phi(x)$ and the derivatives $\phi'(x), \ldots, \phi^{(r-1)}(x)$
should assume preassigned values for $x = x_0$, and correspondingly for
the other points $x_i$. The polynomial D(x) then is of the form
$C(x - x_0)^r (x - x_r)\cdots$. The general theorem of Rolle and the
uniqueness theorem, as well as the proofs, hold unchanged in this case.


A.II.2 Construction of the Solution. 
Newton’s Interpolation Formula 


We shall now construct an interpolation polynomial $\phi(x)$ of nth
degree, such that $\phi(x_0) = f_0, \ldots, \phi(x_n) = f_n$. In order to construct it in
a stepwise manner, we shall begin with the constant $f_0$ which is a
polynomial $\phi_0(x)$ of 0th order which for all x and, in particular, for
$x = x_0$ assumes the value $A_0 = f_0$. To it we add a polynomial of first
order, vanishing for $x = x_0$ and therefore of the form $A_1(x - x_0)$;
then we determine $A_1$ such that the sum has for $x = x_1$ the correct
value $f_1$. The resulting polynomial of first degree we name $\phi_1(x)$.
Now we add to $\phi_1(x)$ a polynomial of second order which vanishes for
$x = x_0$ and $x = x_1$, and is thus of the form $A_2(x - x_0)(x - x_1)$, whose
addition thus will not change the behavior at these two points; the
factor $A_2$ is then determined so that the resulting polynomial of second
order, $\phi_2(x)$, will also take the assigned value, in this case $f_2$, at $x = x_2$.
This procedure is continued until all points are reached and we
obtain the polynomial

$$\phi(x) = \phi_n(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)(x - x_1) + \cdots + A_n(x - x_0)\cdots(x - x_{n-1}). \tag{33}$$

Our method of obtaining the coefficients $A_\nu$ in the expression for $\phi$ is
made clear by substituting $x = x_0, x = x_1, \ldots, x = x_n$ in order, thus
obtaining the system of n + 1 equations

$$\begin{aligned}
f_0 &= A_0 \\
f_1 &= A_0 + A_1(x_1 - x_0) \\
f_2 &= A_0 + A_1(x_2 - x_0) + A_2(x_2 - x_0)(x_2 - x_1) \\
&\;\;\vdots \\
f_n &= A_0 + A_1(x_n - x_0) + \cdots + A_n(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1}).
\end{aligned} \tag{34}$$

Clearly, we can determine the coefficients $A_0, A_1, \ldots, A_n$ successively
so as to satisfy these equations, and in this way the interpolation
polynomial can be constructed.
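
The successive determination of the coefficients from the triangular system (34) can be sketched in a few lines of Python. The function names below are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an assumption made here to avoid rounding; the sample data are the values of $x^3$ at 0, 1, 2, 3.

```python
from fractions import Fraction


def newton_coefficients(xs, fs):
    """Solve the system (34) for A_0, ..., A_n by forward substitution."""
    coeffs = []
    for k, (xk, fk) in enumerate(zip(xs, fs)):
        partial = Fraction(0)  # value at x_k of the terms already determined
        product = Fraction(1)  # (x_k - x_0)...(x_k - x_{j-1})
        for j in range(k):
            partial += coeffs[j] * product
            product *= xk - xs[j]
        coeffs.append((Fraction(fk) - partial) / product)
    return coeffs


def newton_evaluate(xs, coeffs, x):
    """Evaluate A_0 + A_1(x - x_0) + ... + A_n(x - x_0)...(x - x_{n-1})."""
    value, product = Fraction(0), Fraction(1)
    for xj, a in zip(xs, coeffs):
        value += a * product
        product *= x - xj
    return value


xs = [0, 1, 2, 3]
fs = [x**3 for x in xs]
coeffs = newton_coefficients(xs, fs)
print(coeffs)
print(newton_evaluate(xs, coeffs, 5))
```

Since the interpolation polynomial of third degree for a cubic is the cubic itself, the value obtained at x = 5 agrees with $5^3$.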


When the values $x_\nu$ are equidistant, $x_\nu = x_{\nu-1} + h$, the result can be written
explicitly in a more elegant manner. The equations for the $A_\nu$ now become

$$\begin{aligned}
f_0 &= A_0 \\
f_1 &= A_0 + hA_1 \\
f_2 &= A_0 + 2hA_1 + 2!\,h^2 A_2 \\
f_3 &= A_0 + 3hA_1 + 3\cdot 2\,h^2 A_2 + 3!\,h^3 A_3 \\
&\;\;\vdots \\
f_n &= A_0 + nhA_1 + \cdots + \frac{n!}{(n - i)!}\,h^i A_i + \cdots + n!\,h^n A_n.
\end{aligned} \tag{35}$$

The solutions may easily be expressed as successive differences of f:

Given any sequence (finite or infinite) of terms $f_0, f_1, f_2, \ldots$, we call the
expressions

$$\Delta f_0 = f_1 - f_0, \quad \Delta f_1 = f_2 - f_1, \quad \Delta f_2 = f_3 - f_2, \quad \ldots$$

the first differences of the $f_\nu$. Applying the differencing process again to the
sequence of $\Delta f_\nu$, we obtain the expressions

$$\Delta^2 f_0 = \Delta f_1 - \Delta f_0, \quad \Delta^2 f_1 = \Delta f_2 - \Delta f_1, \quad \Delta^2 f_2 = \Delta f_3 - \Delta f_2, \quad \ldots,$$

that is,

$$\Delta^2 f_0 = f_2 - 2f_1 + f_0, \quad \Delta^2 f_1 = f_3 - 2f_2 + f_1, \quad \ldots,$$

which are the second differences of the $f_\nu$. The nth difference $\Delta^n f_\nu$ is defined
recursively as $\Delta^{n-1} f_{\nu+1} - \Delta^{n-1} f_\nu$. When expressed directly in terms of the $f_\nu$
it is given by the formula

$$\Delta^n f_\nu = f_{\nu+n} - \binom{n}{1} f_{\nu+n-1} + \binom{n}{2} f_{\nu+n-2} - \cdots + (-1)^n f_\nu, \tag{36}$$

which follows by a simple inductive argument left to the reader. With this
terminology the coefficients $A_\nu$ can be written in the form

$$A_\nu = \frac{1}{\nu!\,h^\nu}\,\Delta^\nu f_0, \tag{37}$$

as can be verified by induction.¹


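Newton's Interpolation Formula. Putting $\xi = (x - x_0)/h$ we have
$x - x_\nu = h(\xi - \nu)$. The expressions $(x - x_0)(x - x_1)\cdots(x - x_\nu)$ assume then the
form $\xi(\xi - 1)\cdots(\xi - \nu)h^{\nu+1}$. Thus we obtain for the polynomials $\phi(x)$
from (33), (37), Newton's interpolation formula²:

$$\phi(x) = \phi_n(x_0 + \xi h) = f_0 + \binom{\xi}{1}\Delta f_0 + \binom{\xi}{2}\Delta^2 f_0 + \cdots + \binom{\xi}{n}\Delta^n f_0.$$

If $f_0, f_1, f_2, \ldots$ are the values of a function f(x) at the points $x_0, x_1, x_2, \ldots$,
where f has continuous derivatives through the nth order, then $\Delta^\nu f_0/h^\nu$ is an
approximation to the derivative $f^{(\nu)}(x_0)$; we shall show on p. 476 that

$$\lim_{h \to 0} \frac{1}{h^\nu}\,\Delta^\nu f_0 = f^{(\nu)}(x_0).$$

Since also

$$\lim_{h \to 0} h^k \binom{\xi}{k} = \frac{(x - x_0)^k}{k!},$$

we see that in this case $\phi(x)$ tends to the Taylor polynomial $P_n(x)$ when h tends
to zero.

1 We have to verify that the values $A_\nu$ given by (37) satisfy the equations (35);
that is, for any sequence $f_0, f_1, f_2, \ldots$, the identity

$$f_k = f_0 + \binom{k}{1}\Delta f_0 + \binom{k}{2}\Delta^2 f_0 + \cdots + \Delta^k f_0$$

is satisfied. Assuming that this is true for a certain k and every sequence, we
apply it to the sequence $f_1, f_2, \ldots$ and find

$$\begin{aligned}
f_{k+1} &= f_1 + \binom{k}{1}\Delta f_1 + \binom{k}{2}\Delta^2 f_1 + \cdots + \Delta^k f_1 \\
&= (f_0 + \Delta f_0) + \binom{k}{1}(\Delta f_0 + \Delta^2 f_0) + \binom{k}{2}(\Delta^2 f_0 + \Delta^3 f_0) + \cdots \\
&= f_0 + \binom{k+1}{1}\Delta f_0 + \binom{k+1}{2}\Delta^2 f_0 + \cdots + \Delta^{k+1} f_0,
\end{aligned}$$

which is the identity for the case k + 1.

2 As on p. 457 we define here the binomial coefficients $\binom{\xi}{k}$ for general $\xi$ and
positive integers k by $\binom{\xi}{k} = \xi(\xi - 1)\cdots(\xi - k + 1)/k!$.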
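
For equidistant points the whole computation reduces to a difference table. The Python sketch below is an illustration under assumed data (the values of $x^3$ at 0, 1, 2, 3 with h = 1); the function names are hypothetical. It forms the leading differences $f_0, \Delta f_0, \Delta^2 f_0, \ldots$ and evaluates Newton's interpolation formula with the general binomial coefficients of $\xi = (x - x_0)/h$.

```python
from fractions import Fraction


def forward_differences(fs):
    """Return [f_0, Delta f_0, Delta^2 f_0, ..., Delta^n f_0]."""
    row, leading = list(fs), []
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading


def newton_forward(x0, h, fs, x):
    """Evaluate f_0 + C(xi,1) Delta f_0 + ... + C(xi,n) Delta^n f_0."""
    xi = Fraction(x - x0) / Fraction(h)
    value, binom = Fraction(0), Fraction(1)  # binom holds C(xi, k)
    for k, diff in enumerate(forward_differences(fs)):
        value += binom * diff
        binom *= (xi - k) / (k + 1)  # C(xi, k) -> C(xi, k + 1)
    return value


fs = [0, 1, 8, 27]  # x^3 at x = 0, 1, 2, 3
print(forward_differences(fs))
print(newton_forward(0, 1, fs, Fraction(1, 2)))
```

Dividing the kth leading difference by $k!\,h^k$ gives the coefficient $A_k$ of (37).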


We note that the construction of the interpolation polynomial is
possible in the same manner, if, perhaps, the first r values $x_0, \ldots, x_{r-1}$
coincide, and corresponding values $f_0, f_0', \ldots, f_0^{(r-1)}$ are preassigned
for $\phi(x_0), \phi'(x_0), \ldots, \phi^{(r-1)}(x_0)$, which coincide with the values

$$f(x_0), \; f'(x_0), \; \ldots, \; f^{(r-1)}(x_0),$$

for a given function f. For $\phi(x)$ we write the form

$$\phi(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)^2 + \cdots + A_r(x - x_0)^r + A_{r+1}(x - x_0)^r(x - x_r) + \cdots;$$

we then determine the $A_\nu$ in order from the equations

$$\begin{aligned}
f_0 &= A_0, \quad f_0' = A_1, \quad f_0'' = 2A_2, \quad \ldots, \quad f_0^{(r-1)} = (r - 1)!\,A_{r-1}, \\
f_r &= A_0 + A_1(x_r - x_0) + \cdots + A_r(x_r - x_0)^r, \\
f_{r+1} &= A_0 + A_1(x_{r+1} - x_0) + \cdots + A_r(x_{r+1} - x_0)^r + A_{r+1}(x_{r+1} - x_0)^r(x_{r+1} - x_r).
\end{aligned}$$


A.II.3 The Estimate of the Remainder


For the foregoing considerations it did not matter how the values
$f_0, f_1, \ldots, f_n$ were originally given. For instance, if these values were
obtained from physical observations, the problem of constructing the
interpolation polynomial could still be completely solved, giving us
then in $\phi(x)$ a simple smooth function defined for all x and taking the
observed values at the given points, which can be used to “predict”
approximate values for f(x) at other x. However, if the function f(x)
taking the n + 1 given values $f_\nu$ at the given points $x_\nu$ is defined also
for intermediate values x, we have to face the new problem of estimating
the difference $R(x) = f(x) - \phi(x)$, the error of interpolation. We
know at first only that $R(x_0) = R(x_1) = \cdots = R(x_n) = 0$. In order to be
able to say more, we must make further assumptions on the behavior
of the function f(x), which affect the remainder R(x). We will therefore
assume that in the interval under consideration f(x) has continuous
derivatives of at least the (n + 1)th order.

We note at first that for every choice of the constant c, the function

$$K(x) = R(x) - c(x - x_0)(x - x_1)\cdots(x - x_n)$$

vanishes at the n + 1 points $x_0, \ldots, x_n$. Choose now any value y
distinct from $x_0, x_1, \ldots, x_n$. We can then determine c so that K(y) = 0,
that is,

$$c = \frac{R(y)}{(y - x_0)(y - x_1)\cdots(y - x_n)}.$$

Then there are n + 2 points at which K(x) vanishes. We apply the
generalized Rolle’s theorem used earlier to K(x); by this we know there
is a value $x = \xi$ between the largest and the smallest of the values
$x_0, x_1, \ldots, x_n, y$, such that $K^{(n+1)}(\xi) = 0$. Since $R(x) = f(x) - \phi(x)$,
and $\phi$, as a polynomial of nth order, has an identically vanishing
(n + 1)th derivative, we have

$$f^{(n+1)}(\xi) - c\,(n + 1)! = 0,$$

noting that (n + 1)! is the (n + 1)th derivative of $(x - x_0)\cdots(x - x_n)$.
Thus we have obtained for c a second expression $c = f^{(n+1)}(\xi)/(n + 1)!$,
containing $\xi$ and depending in some manner on y. We now use the
equation K(y) = 0, in which y is completely arbitrary and therefore
can be replaced by x, and obtain the representation

$$R(x) = \frac{(x - x_0)(x - x_1)\cdots(x - x_n)}{(n + 1)!}\,f^{(n+1)}(\xi), \tag{38}$$

where $\xi$ is some value lying between the smallest and the largest of the
points $x, x_0, x_1, \ldots, x_n$.
Thus the general problem of interpolation for a given function $f(x)$
is completely solved. We have for $f(x)$ the representation
$$f(x) = A_0 + A_1 (x - x_0) + A_2 (x - x_0)(x - x_1) + \cdots + A_n (x - x_0)(x - x_1) \cdots (x - x_{n-1}) + R_n, \tag{39}$$
where the coefficients $A_0, A_1, \ldots, A_n$ can be found successively from
the values of $f$ at the points $x_0, x_1, \ldots, x_n$ by the recursion formulas
(34) on p. 472 and where the remainder $R_n$ is of the form
$$R_n = \frac{(x - x_0)(x - x_1) \cdots (x - x_n)}{(n + 1)!}\, f^{(n+1)}(\xi), \tag{40}$$


with a suitable number $\xi$ between the largest and smallest of the values
$x, x_0, x_1, \ldots, x_n$.

If we take the corresponding formula (39) for $f(x)$ with $n$ replaced
by $n - 1$ and subtract, we obtain
$$A_n (x - x_0)(x - x_1) \cdots (x - x_{n-1}) + R_n - R_{n-1} = 0.$$
For $x = x_n$ we have $R_n = 0$, and hence for the coefficient $A_n$ (using (40)
with $n$ replaced by $n - 1$) the representation
$$A_n = \frac{1}{n!}\, f^{(n)}(\xi),$$
where $\xi$ lies between the smallest and largest of the values $x_0, x_1, \ldots, x_n$.
Similar representations exist for $A_{n-1}, A_{n-2}, \ldots, A_0$. Thus we recognize
that if the points $x_0, x_1, \ldots, x_n$ are tending together to one and the
same point, perhaps the origin, then our interpolation formula (39)
goes term for term into the Taylor formula (27a), p. 452, with the
Lagrange form (21), p. 449, of the remainder. The Taylor formula can
thus be considered a limiting case of the Newton interpolation formula.
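
As an illustration only, the following Python sketch builds the coefficients $A_0, \ldots, A_n$ of the Newton form (39) and compares the actual interpolation error with the bound that follows from the remainder (40). It is a minimal sketch under stated assumptions: the coefficients are computed by divided differences, which is one standard way of solving the successive equations for the $A_k$; the sample function $\sin x$, the nodes, and the names `newton_coefficients` and `newton_eval` are choices made here and do not come from the text.

```python
import math

def newton_coefficients(xs, fs):
    """Return A_0..A_n of the Newton form (39), computed by divided differences."""
    coeffs = list(fs)
    n = len(xs)
    for k in range(1, n):
        # After pass k, coeffs[i] holds the k-th divided difference
        # built on the nodes x_{i-k}, ..., x_i.
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - k])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate A_0 + A_1 (x - x_0) + ... by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

# Hypothetical data: four nodes (n = 3) and f = sin.
xs = [0.0, 0.5, 1.0, 1.5]
fs = [math.sin(x) for x in xs]
A = newton_coefficients(xs, fs)

x = 0.75
error = abs(math.sin(x) - newton_eval(xs, A, x))
# Remainder (40) with n = 3: the fourth derivative of sin is bounded by 1.
bound = math.prod(abs(x - xi) for xi in xs) / math.factorial(4)
print(error, bound)
```

The printed error should not exceed the printed bound, since $|f^{(4)}(\xi)| \le 1$ for the sine function.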

This formula enables us to give precise meaning to an expression
commonly used in geometry. The osculating parabola of nth order, which
meets a given curve at a point, is said to have “(n + 1) consecutive
points in common” with the given curve at the point. Actually,
we obtain this osculating parabola if we find a parabola having n + 1 
points in common with the curve, and then draw these points together. 
Analytically, this just corresponds to the transition from the inter- 
polating to the Taylor polynomial. In the same fashion we can 
characterize the osculation of arbitrary curves. For example, the 
circle of curvature is that circle which has three consecutive points 
in common with the given curve. 

The interpolation formula can be expected to give the values of a
function whose values at some definite points are known, with a high
degree of accuracy between these points (both $|f^{(n+1)}(\xi)|$ and the
$|x - x_\nu|$ are then bounded). If the value $x$ lies outside the interval
spanned by the points $x_0, x_1, \ldots, x_n$, we speak of extrapolation. By means of such
an extrapolation we shall obtain good agreement provided the point x 
is sufficiently near the given points. The Taylor formula corresponds 
in a sense to complete extrapolation; in general, it is suitable for use 
only in a neighborhood of a point. 


A.II.4 The Lagrange Interpolation Formula 


In closing, we solve the interpolation problem by a somewhat different
formula, due to Lagrange, and differing from Newton’s interpolation
formula insofar as each individual term contains only one of
the given values of the function. Moreover, the formula gives $\phi(x)$ quite
explicitly, not requiring any solution of recursive formulas. For
brevity we introduce the polynomial of $(n + 1)$th degree
$$\psi(x) = (x - x_0)(x - x_1) \cdots (x - x_n),$$
corresponding to the given points $x_\nu$. Differentiating by the product
rule and substituting then successively for $x$ the values $x_0, \ldots, x_n$, we
obtain the relations
$$\psi'(x_0) = (x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n),$$
$$\psi'(x_\nu) = (x_\nu - x_0) \cdots (x_\nu - x_{\nu-1})(x_\nu - x_{\nu+1}) \cdots (x_\nu - x_n),$$
$$\psi'(x_n) = (x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1}).$$
We note that
$$\frac{\psi(x)}{(x - x_\nu)\,\psi'(x_\nu)} = \frac{(x - x_0) \cdots (x - x_{\nu-1})(x - x_{\nu+1}) \cdots (x - x_n)}{(x_\nu - x_0) \cdots (x_\nu - x_{\nu-1})(x_\nu - x_{\nu+1}) \cdots (x_\nu - x_n)}$$
is a polynomial of $n$th degree, having at the point $x = x_\nu$ the value 1,
and at the remaining points $x_\mu$ the value 0; then it is immediately
clear that the expression
$$\phi(x) = \psi(x) \left[ \frac{f_0}{(x - x_0)\,\psi'(x_0)} + \frac{f_1}{(x - x_1)\,\psi'(x_1)} + \cdots + \frac{f_n}{(x - x_n)\,\psi'(x_n)} \right] \tag{41}$$
is the desired interpolation polynomial. This is the interpolation
formula of Lagrange.
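
Formula (41) can be tried out directly. The following Python sketch is illustrative only: the function name `lagrange_interpolate`, the nodes, and the cubic test function are assumptions made here. Each factor $\psi(x)/((x - x_\nu)\psi'(x_\nu))$ is formed as a product over the other nodes, which avoids dividing by $x - x_\nu$.

```python
def lagrange_interpolate(xs, fs, x):
    """Evaluate the Lagrange interpolation polynomial (41) at x."""
    total = 0.0
    for nu, (x_nu, f_nu) in enumerate(zip(xs, fs)):
        # Basis polynomial: equals 1 at x_nu and 0 at every other node.
        basis = 1.0
        for mu, x_mu in enumerate(xs):
            if mu != nu:
                basis *= (x - x_mu) / (x_nu - x_mu)
        total += f_nu * basis
    return total

# Hypothetical data: a cubic sampled at four nodes is reproduced exactly.
xs = [0.0, 1.0, 2.0, 3.0]
fs = [x**3 - 2.0 * x + 1.0 for x in xs]
value = lagrange_interpolate(xs, fs, 1.5)
print(value)
```

Since the sampled function is itself a cubic and four nodes are used, the remainder (40) vanishes and the printed value agrees with $1.5^3 - 2 \cdot 1.5 + 1 = 1.375$ up to rounding.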


PROBLEMS 


SECTION 5.4b, page 540 


1. Give the complete formal derivation of the remainder formula (27), 
p. 452, using mathematical induction. 


2. (A Variant of Proof of Taylor’s Theorem)

(a) If $g(h)$ has continuous derivatives through the $(n + 1)$th order for
$0 \le h \le A$, and if $g(0) = g'(0) = \cdots = g^{(n)}(0) = 0$, while $|g^{(n+1)}(h)| \le M$
on $[0, A]$, for $M$ a constant, show that $|g^{(n)}(h)| \le Mh$, $|g^{(n-1)}(h)| \le
Mh^2/2!$, ..., $|g^{(n-i+1)}(h)| \le Mh^i/i!$, ..., $|g(h)| \le Mh^{n+1}/(n+1)!$, for all $h$ in the
interval.
(b) Let $f(x)$ be a sufficiently differentiable function on $a \le x \le b$, and $T_n(h)$
be the Taylor polynomial for $f(x)$ at $x = a$. Apply the result of (a) to the
function $g(h) = R_n = f(a + h) - T_n(h)$ to obtain Taylor’s formula with a
rough estimate for the remainder.


3. Let $f(x)$ have a continuous derivative in the interval $a < x < b$, and
let $f''(x) \ge 0$ for every value of $x$. Then if $\xi$ is any point in the interval, the
curve nowhere falls below its tangent at the point $x = \xi$, $y = f(\xi)$.

(Use the Taylor expansion to three terms.) 


4. Deduce the integral formula for the remainder $R_n$ by applying
integration by parts to
$$f(x + h) - f(x) = \int_0^h f'(x + \tau)\, d\tau.$$

5. Integrate by parts the formula
$$R_n = \frac{1}{n!} \int_0^h (h - \tau)^n f^{(n+1)}(x + \tau)\, d\tau,$$
and so obtain
$$R_n = f(x + h) - f(x) - h f'(x) - \cdots - \frac{h^n}{n!} f^{(n)}(x).$$

*6. Suppose that in some way a series for the function $f(x)$ has been
obtained, namely
$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + R_n(x),$$
where $a_0, a_1, \ldots, a_n$ are constants, $R_n(x)$ is $n$ times continuously differentiable,
and $R_n(x)/x^n \to 0$ as $x \to 0$. Show that $a_k = f^{(k)}(0)/k!$ $(k = 0, \ldots, n)$,
that is, that the series is a Taylor series.


SECTION 5.5, page 453 


1. Find the first four nonvanishing terms of the Taylor series for the
following functions in the neighborhood of $x = 0$:

(a) $x \cot x$                 (d) $e^{\sin x}$
(b) $\sqrt{\sin x}/\sqrt{x}$          (e) $e^{e^x}$
(c) $\sec x$                   (f) $\log \sin x - \log x$.


2. Find the Taylor series for arc sin $x$ in the neighborhood of $x = 0$ by
using
$$\operatorname{arc\,sin} x = \int_0^x \frac{dt}{\sqrt{1 - t^2}}.$$
Compare Section 3.2, Problem 2.

*3. Find the first three nonvanishing terms of the Taylor series for $\sin^2 x$
in the neighborhood of $x = 0$ by multiplying the Taylor series for $\sin x$ by
itself. Justify this procedure. 


*4. Find the first three nonvanishing terms of the Taylor series for $\tan x$
in the neighborhood of $x = 0$ by using the relation $\tan x = \sin x/\cos x$, and
justify the procedure.

*5. Find the first three nonvanishing terms of the Taylor series for $\sqrt{\cos x}$
in the neighborhood of $x = 0$ by applying the binomial theorem to the
Taylor series for $\cos x$, and justify the procedure.

*6. Find the Taylor series for $(\operatorname{arc\,sin} x)^2$. Compare Section 3.2, Problem 2.

7. Find the Taylor series for the following functions in the neighborhood
of $x = 0$:

(a) $\sinh^{-1} x$.    (b) $\int_0^x e^{-t^2}\, dt$.    (c) $\int_0^x \frac{\sin t}{t}\, dt$.

*8. Estimate the error involved in using the first $n$ terms in the series in
Problem 7.


9. The elliptic function $s(u)$ has been defined (Section 3.14a) as the inverse
of the elliptic integral
$$u(s) = \int_0^s \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}.$$
Find the Taylor expansion of $s(u)$ to the term of degree 5.


10. Evaluate the following limits: 


x . lyx? 
(a) imz (1 +}) ~e]; (d) lim (=) , 
w— æ T z—0 T 
b) li e 2| f] a} \ li = PAi 
O im ge tat (1 +i) -ej © im (F5) 
f : 1\" 1\" 
(c) im =| (1 +1) — e log ( Hi, 
ses x x 


*11. Find the first three terms of the Taylor series for $[1 + (1/x)]^x$ in
powers of $1/x$.


*12. Two oppositely charged particles $+e$, $-e$ situated at a small distance
$d$ apart form an electric dipole with moment $M = ed$. Show that the potential
energy

(a) At a point situated on the axis of the dipole at a distance $r$ from the
center of the dipole is $(M/r^2)(1 + \epsilon)$, where $\epsilon$ is approximately equal to
$d^2/4r^2$.

(b) At a point situated on the perpendicular bisector of the dipole is 0.

(c) At a point with polar coordinates $r$, $\theta$ relative to the center and axis of
the dipole is $(M \cos\theta/r^2)(1 + \epsilon)$, where $\epsilon$ is approximately equal to
$$(d^2/8r^2)(5 \cos^2\theta - 3).$$

(The potential energy of a single charge $q$ at a point at a distance $r$ from the
charge is $q/r$; the potential energy of several charges is the sum of the potential
energies of the separate charges.)


SECTION 5.6, page 457 


1. Prove if $f(a) = 0$ and $f(x)$ has sufficiently many derivatives at $x = a$
that $f(x)^n$ has at least an $(n - 1)$th order contact with the $x$-axis.

2. The curve $y = f(x)$ passes through the origin $O$ and touches the $x$-axis at
$O$. Show that the radius of curvature of the curve at $O$ is given by


*3. K is a circle which touches a given curve at a point P and passes 
through a neighboring point Q of the curve. Show that the limit of the circle 
K as Q — P is the circle of curvature of the curve at P. 


*4. Show that the order of contact of a curve and its osculating circle is at 
least three at points where the radius of curvature is a maximum or minimum. 


*5. Show that the osculating circle at a point where the radius of curvature 
is a maximum or minimum does not cross the curve unless the contact is of 
higher than third order. 


*6. Find the maxima and minima of the following functions: 
(a) cos x cosh x (b) x + cosx 


*7. Determine the maxima and minima of the function $y = e^{-1/x^2}$ (see
p. 242).


SECTION A.3, page 464 


1. Prove if $f$ is continuous on the interval $[0, 1]$ that


1 
Jim 2 i dz = f(0). 


r—0 x 


2. Prove that the function $y = (x^2)^x$, $y(0) = 1$ is continuous at $x = 0$.


6 


Numerical Methods 


The task of solving an analytical problem always remains uncom- 
pleted. The proof of the existence and of some basic properties of the 
solution is usually considered satisfactory, but relevant questions 
always remain to be answered. Thus, when the solution is defined by a 
limit process, for example by an integral, the problem arises of actually 
finding approximations to this limit and of estimating the accuracy of 
these approximations. Not only are such questions of basic importance 
theoretically but they are also inevitable, if we wish to apply analysis 
to the description and control of natural phenomena which in principle 
can be described only in an approximate manner. 

Accordingly it is a great challenge to carry the solution to the point 
where numerical answers and estimates of their accuracy come into 
reach. 

Recently, with the advent of high-speed automatic computing
machines, theoretical and practical aspects of “numerical analysis” have
received a great stimulus; they are presented in a variety of textbooks.¹
For centuries, however, many of the foremost mathematicians, such as 
Newton, Euler, and, in particular, Gauss, have greatly contributed to 
numerical methods. 

In this volume we cannot present numerical analysis in a com- 
prehensive way, but at least we shall discuss some of the simple classical 
results. 


1 See for example, Hildebrand, Introduction to Numerical Analysis, McGraw-Hill 
Book Co., 1956; Householder, Principles of Numerical Analysis, McGraw-Hill 
Book Co., 1953; and Whittaker and Robinson, The Calculus of Observations, 
Blackie and Sons, Ltd., 1929.


6.1 Computation of Integrals


Although the existence of the integral of a (continuous) function is
assured by the theory of Chapter 2, the evaluation of such an integral
or “quadrature”¹ cannot be effected by elementary functions except in
relatively rare cases. We must therefore devise methods for numerical
integration and for estimating the accuracy of the numerical
approximation.

To compute approximately the integral
$$J = \int_a^b f(x)\, dx \tag{1}$$
with $a < b$, we subdivide the interval $a \le x \le b$ into $n$ equal parts, each
of length $h = (b - a)/n$ by means of the $n + 1$ points
$$x_\nu = a + \nu h, \qquad nh = b - a, \qquad \nu = 0, 1, \ldots, n. \tag{2}$$
Then
$$J = \sum_{\nu=1}^{n} J_\nu,$$
where
$$J_\nu = \int_{x_{\nu-1}}^{x_\nu} f(x)\, dx; \tag{3}$$
the problem of computing the integral $J$ is reduced to that of obtaining
good approximations for the areas $J_\nu$ of strips of width $h$ into which we
have dissected the entire area, represented by $J$.


a. Approximation by Rectangles

The most direct approximation, paraphrasing the original definition
of the integral, yields the relation
$$J \approx \sum_{\nu=1}^{n} h f_\nu = h (f_1 + f_2 + \cdots + f_n),$$
where for abbreviation we set
$$f_\nu = f(x_\nu).$$
Here (and throughout this chapter) the symbol $\approx$ means “approximately
equal.”

1 The word “quadrature” indicates the process of “squaring”, that is, of measuring
an area inside a curve by finding a square having the same area (as in the problem
of “squaring the circle”).

To estimate the accuracy or “error” of this approximation, we
assume that $f(x)$ is continuous with a uniformly bounded derivative on
the interval $a \le x \le b$: $|f'(x)| \le M_1$. Then it can be proved easily (see
Problem 4, Section 6.1, p. 507) that
$$|J_\nu - h f_\nu| \le \frac{h^2}{2} M_1, \tag{4}$$
or therefore
$$\left| J - h \sum_{\nu=1}^{n} f_\nu \right| \le n\, \frac{h^2}{2} M_1 = \tfrac{1}{2} M_1 (b - a) h. \tag{5}$$

Thus the accuracy of the approximation of the integral by the finite
sum is of the order $h$ of the “mesh width” in the terminology of Chapter
3, p. 252.
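
The first-order behavior expressed by (5) can be observed numerically. The following Python sketch is only an illustration: the integrand $e^x$ on $[0, 1]$, the values of $n$, and the name `rectangle_rule` are assumptions chosen here. It compares the actual error of the rectangle sum with the bound $\tfrac{1}{2} M_1 (b - a) h$.

```python
import math

def rectangle_rule(f, a, b, n):
    """Rectangle sum h * (f_1 + ... + f_n) with x_nu = a + nu * h."""
    h = (b - a) / n
    return h * sum(f(a + nu * h) for nu in range(1, n + 1))

a, b = 0.0, 1.0
exact = math.e - 1.0          # integral of exp over [0, 1]
M1 = math.e                   # maximum of |f'| = exp on [0, 1]
for n in (10, 100, 1000):
    h = (b - a) / n
    error = abs(exact - rectangle_rule(math.exp, a, b, n))
    bound = 0.5 * M1 * (b - a) * h   # right-hand side of estimate (5)
    print(n, error, bound)
```

Each tenfold increase of $n$ should reduce both the error and the bound by roughly a factor of ten, in agreement with the order $h$.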


b. Refined Approximations—Simpson’s Rule 


A better approximation is obtained with hardly more effort if we
approximate the areas $J_\nu$ not by rectangular strips but by the slender
trapezoids, as in Fig. 6.1a. The approximation formula (trapezoid
formula) is then
$$J \approx h (f_1 + f_2 + \cdots + f_{n-1}) + \frac{h}{2} (f_0 + f_n), \tag{6}$$
since every function value except the first and the last appears twice.

An approximation which is generally slightly more precise than that
of the trapezoid formula is that in which the $\nu$th strip is approximated
by a trapezoid bounded above by the tangent to the curve at the
midpoint $x_{\nu-1} + h/2$ of the interval $x_{\nu-1} \le x \le x_\nu$. The area of this
trapezoid is simply
$$h f_{\nu - 1/2} = h f\!\left(x_{\nu-1} + \frac{h}{2}\right),$$
and we obtain by addition the tangent formula,
$$J \approx h (f_{1/2} + f_{3/2} + \cdots + f_{(2n-1)/2}). \tag{7}$$
As we shall see on p. 486, the accuracy of this approximation is of order
$h^2$ when the second derivative of $f$ is continuous in the interval $a \le x \le b$
and $|f''(x)| \le M_2$, with some constant bound $M_2$.

Finally, we mention the famous approximation of Simpson, which
with little additional effort yields a much more accurate approximation
if the fourth derivative of $f$ exists and is uniformly bounded in the
interval:
$$|f^{(4)}(x)| \le M_4,$$
with $M_4$ a constant. Simpson’s formula for $n = 2m$ is
$$J \approx \frac{4h}{3} (f_1 + f_3 + \cdots + f_{2m-1}) + \frac{2h}{3} (f_2 + f_4 + \cdots + f_{2m-2}) + \frac{h}{3} (f_0 + f_{2m}). \tag{8}$$

Figure 6.1 (a) The trapezoid formula. (b) The tangent formula.

Figure 6.2 Simpson’s rule.

The formula is easily obtained if we approximate the region composed
of the $\nu$th and $(\nu + 1)$th strips by a strip of width $2h$ bounded above by
the parabola which agrees with $f$ at the three abscissae $x_{\nu-1}$,
$x_\nu = x_{\nu-1} + h$, and $x_{\nu+1} = x_{\nu-1} + 2h$ (see Fig. 6.2). Newton’s
interpolation formula (p. 473) yields the equation of this parabola:
$$y = f_{\nu-1} + (x - x_{\nu-1}) \frac{f_\nu - f_{\nu-1}}{h} + \frac{(x - x_{\nu-1})(x - x_{\nu-1} - h)}{2} \cdot \frac{f_{\nu+1} - 2 f_\nu + f_{\nu-1}}{h^2};$$
hence we have the approximation
$$J_\nu + J_{\nu+1} \approx \int_{x_{\nu-1}}^{x_{\nu+1}} y\, dx = \int_{x_{\nu-1}}^{x_{\nu-1} + 2h} y\, dx$$
$$= 2h f_{\nu-1} + 2h (f_\nu - f_{\nu-1}) + \frac{h}{3} (f_{\nu+1} - 2 f_\nu + f_{\nu-1})$$
$$= \frac{h}{3} (f_{\nu-1} + 4 f_\nu + f_{\nu+1}).$$
The formula is now obtained for even $n = 2m$ by adding all these
approximate values for $\nu = 1, 3, 5, \ldots, 2m - 1$ or all the areas of the
pairs of strips.
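
The three formulas (6), (7), and (8) translate almost literally into code. The sketch below is a minimal illustration, not a library-grade implementation; the function names and the test integrand $\sin x$ on $[0, \pi]$ (whose integral is 2) are assumptions made here.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoid formula (6)."""
    h = (b - a) / n
    inner = sum(f(a + nu * h) for nu in range(1, n))
    return h * inner + 0.5 * h * (f(a) + f(b))

def tangent(f, a, b, n):
    """Tangent formula (7): function values at the strip midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (nu - 0.5) * h) for nu in range(1, n + 1))

def simpson(f, a, b, n):
    """Simpson's formula (8); the number of strips n = 2m must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of strips")
    h = (b - a) / n
    odd = sum(f(a + nu * h) for nu in range(1, n, 2))
    even = sum(f(a + nu * h) for nu in range(2, n, 2))
    return (4 * h / 3) * odd + (2 * h / 3) * even + (h / 3) * (f(a) + f(b))

exact = 2.0  # integral of sin over [0, pi]
for rule in (trapezoid, tangent, simpson):
    print(rule.__name__, abs(exact - rule(math.sin, 0.0, math.pi, 8)))
```

With eight strips the printed errors should decrease from the trapezoid formula to the tangent formula to Simpson's formula.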


* Accuracy


It is not difficult to estimate the accuracy of our approximations.
Each quadrature proceeds by approximating the function $f(x)$ in an
interval by an easily integrated function $\phi(x)$ (a polynomial). An
estimate for the error in the integration formula can thus be
obtained by estimating $|f(x) - \phi(x)|$.

In the tangent formula (p. 483) we replaced $f(x)$ in the interval
$[x_{\nu-1}, x_\nu]$ by its tangent at the midpoint $x_\nu - (h/2)$, that is, by
$$\phi(x) = f\!\left(x_\nu - \frac{h}{2}\right) + \left(x - x_\nu + \frac{h}{2}\right) f'\!\left(x_\nu - \frac{h}{2}\right).$$
By Taylor’s theorem with Lagrange’s form of the remainder
$$f(x) = \phi(x) + \frac{1}{2} \left(x - x_\nu + \frac{h}{2}\right)^2 f''(\xi),$$
where $\xi$ lies between $x$ and $x_\nu - h/2$. Hence the error corresponding to
one strip is estimated by
$$|J_\nu - h f_{\nu - 1/2}| = \left| \int_{x_{\nu-1}}^{x_\nu} [f(x) - \phi(x)]\, dx \right| \le \int_{x_{\nu-1}}^{x_\nu} |f(x) - \phi(x)|\, dx \le M_2 \int_{x_\nu - h}^{x_\nu} \frac{1}{2} \left(x - x_\nu + \frac{h}{2}\right)^2 dx = \frac{h^3}{24} M_2.$$
For the total error in the tangent formula contributed by the various
intervals¹ we find then the upper bound
$$n\, \frac{h^3}{24} M_2 = \frac{h^2}{24} M_2 (b - a).$$

1 This is the total error inherent in using the approximating formula, the so-called
truncation error; in practice, additional error arises because of round-off in the
computation. The total effect of round-off errors increases most likely with the
number of steps taken, that is, with decreasing $h$, whereas the truncation error
decreases.

We use this derivation as a model for estimating the error in the other
quadrature formulas. In the trapezoidal rule (6) we approximate $f(x)$ in
the interval $[x_{\nu-1}, x_\nu]$ by the linear interpolation polynomial
$$\phi(x) = f_{\nu-1} + (x - x_{\nu-1}) \frac{f_\nu - f_{\nu-1}}{h}.$$
From the error estimate for the remainder in the interpolation formula
[see p. 475, Eq. (40)] for $n = 1$, we find
$$f(x) - \phi(x) = \frac{1}{2} (x - x_{\nu-1})(x - x_\nu) f''(\xi),$$
where $\xi$ lies between $x_{\nu-1}$ and $x_\nu$. Hence the absolute value of the error
in the computation of $J_\nu$ is at most
$$M_2 \int_{x_{\nu-1}}^{x_\nu} \frac{1}{2} |(x - x_{\nu-1})(x - x_\nu)|\, dx = \frac{h^3}{12} M_2,$$
and the total error is then at most $n$ times this quantity:
$$\frac{h^2}{12} M_2 (b - a).$$

The same technique can be applied to Simpson’s rule (8), taking for
$\phi(x)$ the quadratic polynomial agreeing with $f$ in the points $x_{\nu-1}$, $x_\nu$,
$x_{\nu+1}$, leading to an error in $J_\nu + J_{\nu+1}$ of the order $h^4$. Actually, however,
the error estimate can be improved by one order of magnitude by
using a cubic polynomial $\phi(x)$ that gives a better approximation to $f$
in the interval $[x_{\nu-1}, x_{\nu+1}]$ than the quadratic one, and still has the same
integral, thus leading to the same approximation formula (8) for the
integral $J$. We simply use the interpolation polynomial which agrees
with $f(x)$ at the points $x_{\nu-1}$, $x_\nu$, $x_{\nu+1}$ and for which $\phi'(x_\nu) = f'(x_\nu)$; it
has the form
$$\phi(x) = A_0 + A_1 (x - x_{\nu-1}) + A_2 (x - x_{\nu-1})(x - x_\nu) + A_3 (x - x_{\nu-1})(x - x_\nu)(x - x_{\nu+1}).$$
Here the first three terms represent the quadratic interpolation
polynomial agreeing with $f$ at the three points $x_{\nu-1}$, $x_\nu$, $x_{\nu+1}$. The constant
$A_3$ has to be determined from the condition $\phi'(x_\nu) = f'(x_\nu)$.

The last term
$$A_3 (x - x_\nu + h)(x - x_\nu)(x - x_\nu - h) = A_3 [(x - x_\nu)^2 - h^2] \cdot [x - x_\nu]$$
obviously is an odd function of $x - x_\nu$ and therefore does not
contribute to the integral between the limits $x_\nu - h$ and $x_\nu + h$. For the
error in the approximation to $f$ we then have the estimate [cf. (40),
p. 475, with $n = 3$ and with two of the interpolation points coincident
at $x_\nu$]
$$|f(x) - \phi(x)| = \left| \frac{1}{4!} (x - x_{\nu-1})(x - x_\nu)^2 (x - x_{\nu+1}) f^{(4)}(\xi) \right| \le \frac{M_4}{24} (x - x_\nu)^2 [h^2 - (x - x_\nu)^2].$$
This yields for the error in the computation of $J_\nu + J_{\nu+1}$ the estimate
$$\frac{M_4}{24} \int_{x_\nu - h}^{x_\nu + h} (x - x_\nu)^2 [h^2 - (x - x_\nu)^2]\, dx = \frac{h^5}{90} M_4,$$
and hence for the total error the estimate
$$m\, \frac{h^5}{90} M_4 = \frac{h^4}{180} (b - a) M_4.$$
Naturally, we may attain higher accuracy by approximation of the
function $f(x)$ in a strip by a polynomial of a still higher order.
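
The three total-error estimates, $\frac{h^2}{24} M_2 (b - a)$ for the tangent formula, $\frac{h^2}{12} M_2 (b - a)$ for the trapezoid formula, and $\frac{h^4}{180} M_4 (b - a)$ for Simpson's formula, can be checked on a concrete integrand. The sketch below is illustrative only; the choice $f(x) = e^x$ on $[0, 1]$ with $n = 10$ and the function names are assumptions made here.

```python
import math

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + k * h) for k in range(1, n)) + 0.5 * f(b))

def tangent(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def simpson(f, a, b, n):  # n must be even
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))
    even = sum(f(a + k * h) for k in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

a, b, n = 0.0, 1.0, 10
h = (b - a) / n
exact = math.e - 1.0
M2 = M4 = math.e  # every derivative of exp is bounded by e on [0, 1]

bounds = {
    tangent: h**2 / 24 * M2 * (b - a),
    trapezoid: h**2 / 12 * M2 * (b - a),
    simpson: h**4 / 180 * M4 * (b - a),
}
errors = {rule: abs(exact - rule(math.exp, a, b, n)) for rule in bounds}
for rule in bounds:
    print(rule.__name__, errors[rule], bounds[rule])
```

In each printed row the actual error should lie below the corresponding bound.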
Examples. We apply these methods to the calculation of
$$\log_e 2 = \int_1^2 \frac{dx}{x}.$$
Dividing the interval $1 \le x \le 2$ into ten parts of length $h = 1/10$, and
using the trapezoidal rule (6), we obtain

x_1 = 1.1      f_1 = 0.90909
x_2 = 1.2      f_2 = 0.83333
x_3 = 1.3      f_3 = 0.76923
x_4 = 1.4      f_4 = 0.71429
x_5 = 1.5      f_5 = 0.66667
x_6 = 1.6      f_6 = 0.62500
x_7 = 1.7      f_7 = 0.58824
x_8 = 1.8      f_8 = 0.55556
x_9 = 1.9      f_9 = 0.52632

               Sum   6.18773

x_0 = 1.0      ½ f_0 = 0.5
x_10 = 2.0     ½ f_10 = 0.25

                     6.93773 · 1/10
log_e 2 ≈ 0.69377.


Since the graph of the integrand function has its convex side turned
towards the x-axis, this value is too large.

Using the tangent rule (7) we have

x_0 + ½h = 1.05      f_{1/2}  = 0.95238
x_1 + ½h = 1.15      f_{3/2}  = 0.86957
x_2 + ½h = 1.25      f_{5/2}  = 0.80000
x_3 + ½h = 1.35      f_{7/2}  = 0.74074
x_4 + ½h = 1.45      f_{9/2}  = 0.68966
x_5 + ½h = 1.55      f_{11/2} = 0.64516
x_6 + ½h = 1.65      f_{13/2} = 0.60606
x_7 + ½h = 1.75      f_{15/2} = 0.57143
x_8 + ½h = 1.85      f_{17/2} = 0.54054
x_9 + ½h = 1.95      f_{19/2} = 0.51282

                                6.92836 · 1/10

log_e 2 ≈ 0.69284,

which, owing to the convexity of the curve, is too small.
For the same subdivision we obtain a much more precise result using
Simpson’s rule (8). We have

x_1 = 1.1      f_1 = 0.90909
x_3 = 1.3      f_3 = 0.76923
x_5 = 1.5      f_5 = 0.66667
x_7 = 1.7      f_7 = 0.58824
x_9 = 1.9      f_9 = 0.52632

               Sum   3.45955 · 4
                                      13.83820

x_2 = 1.2      f_2 = 0.83333
x_4 = 1.4      f_4 = 0.71429
x_6 = 1.6      f_6 = 0.62500
x_8 = 1.8      f_8 = 0.55556

               Sum   2.72818 · 2
                                       5.45636

x_0 = 1.0      f_0 = 1.0
x_10 = 2.0     f_10 = 0.5

                                      20.79456 · 1/30

log_e 2 ≈ 0.69315.
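
The three hand computations can be reproduced in a few lines. The Python sketch below is an illustration under the same data as the tables ($f(x) = 1/x$, $h = 1/10$); the variable names are choices made here, and the only outside ingredient is `math.log`, used for comparison.

```python
import math

f = lambda x: 1.0 / x
a, b, n = 1.0, 2.0, 10
h = (b - a) / n
nodes = [f(a + k * h) for k in range(n + 1)]       # f_0 ... f_10
mids = [f(a + (k + 0.5) * h) for k in range(n)]    # f_{1/2} ... f_{19/2}

trapezoid = h * (sum(nodes[1:n]) + 0.5 * (nodes[0] + nodes[n]))
tangent = h * sum(mids)
simpson = h / 3 * (4 * sum(nodes[1:n:2]) + 2 * sum(nodes[2:n:2])
                   + nodes[0] + nodes[n])

print(f"trapezoid {trapezoid:.5f}")
print(f"tangent   {tangent:.5f}")
print(f"simpson   {simpson:.5f}")
print(f"log 2     {math.log(2):.6f}")
```

The trapezoid value lies above and the tangent value below $\log_e 2$, as the convexity argument predicts, while Simpson's value agrees to five decimal places.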


In reality

log_e 2 = 0.693147....


6.2 Other Examples of Numerical Methods


a. The “Calculus of Errors”


The “calculus of errors” is simply a numerical application of the 
basic fact of differential calculus: a function f(x) which is differentiable 
a sufficient number of times can be represented in the neighborhood of 
a point by a linear function with an error of higher than the first order, 
by a quadratic function with an error of higher than the second order, 
and so on. 

Consider the linear approximation to a function $y = f(x)$. If
$y + \Delta y = f(x + \Delta x) = f(x + h)$, we have by Taylor’s theorem
$$\Delta y = h f'(x) + \frac{h^2}{2} f''(\xi),$$
where $\xi = x + \theta h$ $(0 < \theta < 1)$ is an intermediate value which need
not be more precisely known. If $h = \Delta x$ is small, we obtain the practical
approximation
$$\Delta y \approx h f'(x).$$
Thus we replace the difference quotient by the derivative to which it is
approximately equal, and the increment of $y$ by the approximately
equal linear expression in $h$.

This simple fact is used for numerical purposes in the following way. 
Suppose two physical quantities x and y are related by y = f(x). We 
then ask what effect an inaccuracy in the measurement of x has on the 
determination of y. If instead of the “true” value x we use the in- 
accurate value x + h, then the corresponding value of y differs from 
the true value $y = f(x)$ by the amount $\Delta y = f(x + h) - f(x)$. The
error is therefore given approximately by the above relation. 

We illustrate the usefulness of such linear approximations by 
examples. 


Examples. (a) In a triangle ABC (cf. Fig. 6.3) suppose that the sides
$b$ and $c$ are measured accurately, whereas the angle $\alpha = x$ is only
measured to within an error $|\Delta x| < \delta$. What is the corresponding error
in the value of the third side $y = a = \sqrt{b^2 + c^2 - 2bc \cos\alpha}$?

We have $\Delta a \approx (bc \sin\alpha\, \Delta\alpha)/a$; the percentage error is therefore
$$\frac{100\, \Delta a}{a} \approx \frac{100\, bc}{a^2} \sin\alpha\, \Delta\alpha.$$
In the special case when $b = 400$ meters, $c = 500$ meters, and $\alpha = 60°$,
we have $y = a = 458.2576$ meters, so that
$$\Delta a \approx \frac{200000}{458.2576} \cdot \frac{1}{2}\sqrt{3}\, \Delta\alpha.$$
If $\Delta\alpha$ can be measured to within 10 seconds of arc, that is, if
$$\Delta\alpha = 10'' = 4848 \times 10^{-8} \text{ radians},$$
we find that at worst
$$\Delta a = 1.83 \text{ cm};$$
thus the error is at most about 0.004%.
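
The numbers of this example can be checked, and the quality of the linearization judged, with a short computation. The Python sketch below is illustrative; the helper name `third_side` is an assumption, and the "actual change" is obtained simply by evaluating the law of cosines at the perturbed angle.

```python
import math

b, c = 400.0, 500.0                      # sides in meters, taken as exact
alpha = math.radians(60.0)
d_alpha = math.radians(10.0 / 3600.0)    # 10 seconds of arc

def third_side(angle):
    return math.sqrt(b * b + c * c - 2.0 * b * c * math.cos(angle))

a = third_side(alpha)
linear = b * c * math.sin(alpha) * d_alpha / a   # linearized error in meters
actual = third_side(alpha + d_alpha) - a         # true change, for comparison

print(f"a = {a:.4f} m")
print(f"linearized error = {100 * linear:.2f} cm")
print(f"actual change    = {100 * actual:.2f} cm")
print(f"relative error   = {100 * linear / a:.4f} %")
```

The linearized and the actual change agree to well below a micrometer here, which shows how little is lost by dropping the second-order term.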


Figure 6.3 


(b) The following example illustrates the usefulness of the
linearization for physical problems.

It is known experimentally that if a metal rod has length $l_0$ at
temperature $t_0$, then at temperature $t$ its length will be $l = l_0(1 + \alpha(t - t_0))$,
where $\alpha$ depends only on $l_0$ and the material of which the rod is
composed. If now a pendulum clock keeps correct time at temperature $t_0$,
how many seconds will it lose per day if the temperature rises to $t_1$?

For the period $T(l)$ of oscillation we have (see p. 411)
$$T(l) = 2\pi \sqrt{\frac{l}{g}};$$
hence
$$\frac{dT}{dl} = \frac{\pi}{\sqrt{lg}}.$$
If the change of length is $\Delta l$, the corresponding change in the period of
oscillation is
$$\Delta T \approx \frac{\pi\, \Delta l}{\sqrt{l_0 g}},$$
where $l_1 = l_0(1 + \alpha(t_1 - t_0))$ and $\Delta l = \alpha l_0 (t_1 - t_0)$. This is the time
lost per oscillation. The time lost per second is $\Delta T/T \approx \Delta l/2l_0$; hence
in one day the clock loses $43{,}200\, \Delta l/l_0 = 43{,}200\, \alpha (t_1 - t_0)$ seconds.

In this case and in many other cases where the function under
consideration is a product of several factors, we can simplify the calculation
by taking the logarithms of both sides before differentiating. In this
example we have
$$\log T = \log 2\pi - \tfrac{1}{2} \log g + \tfrac{1}{2} \log l;$$
differentiating, we have
$$\frac{dT}{T} = \frac{dl}{2l}, \qquad \text{or} \qquad \frac{\Delta T}{T} \approx \frac{\Delta l}{2l},$$
in agreement with the preceding result.
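
For a feeling of the magnitudes involved, the linearized result $43{,}200\, \alpha (t_1 - t_0)$ can be compared with a direct evaluation of the period formula. Every numerical value in the following Python sketch is an assumption made for illustration (a rod of one meter, an expansion coefficient of the size typical for steel, a rise of ten degrees); none of them comes from the text.

```python
import math

g = 9.81            # m/s^2 (assumed)
l0 = 1.0            # rod length at temperature t0, in meters (assumed)
alpha = 1.2e-5      # expansion coefficient per degree (assumed)
dt = 10.0           # temperature rise t1 - t0 in degrees (assumed)

def period(l):
    return 2.0 * math.pi * math.sqrt(l / g)

l1 = l0 * (1.0 + alpha * dt)
# In one day the slowed clock completes 86400 / T(l1) oscillations and
# counts T(l0) seconds for each of them.
exact_loss = 86400.0 * (1.0 - period(l0) / period(l1))
linear_loss = 43200.0 * alpha * dt

print(exact_loss, linear_loss)
```

Both values come out near 5.18 seconds per day and differ by less than a thousandth of a second.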


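*b. Calculation of π


A different example, using special artificial devices, is classical, although
perhaps made obsolete by modern computers.

Leibnitz’s series $\pi/4 = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$ [Eq. (7), Section 5.2,
p. 445], using the series for the inverse tangent, is not suitable for the
calculation of $\pi$, because of the extreme slowness of its convergence. We
may, however, calculate $\pi$ with comparative ease by the following artifice.
If, in the addition theorem for the tangent,
$$\tan(\alpha + \beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha \tan\beta},$$
we introduce the inverse functions $\alpha = \operatorname{arc\,tan} u$, $\beta = \operatorname{arc\,tan} v$, we obtain
the formula
$$\operatorname{arc\,tan} u + \operatorname{arc\,tan} v = \operatorname{arc\,tan}\left(\frac{u + v}{1 - uv}\right).$$
Now, choosing $u$ and $v$ so that $(u + v)/(1 - uv) = 1$, we obtain the value
$\pi/4$ on the right-hand side, and if $u$ and $v$ are small numbers we can easily
calculate the left-hand side by means of known series. If, for example, we
put $u = \frac{1}{2}$, $v = \frac{1}{3}$, as Euler did, we obtain
$$\frac{\pi}{4} = \operatorname{arc\,tan} \tfrac{1}{2} + \operatorname{arc\,tan} \tfrac{1}{3}. \tag{9}$$
If we further notice that
$$\frac{\frac{1}{3} + \frac{1}{7}}{1 - \frac{1}{21}} = \frac{1}{2},$$
we have $\operatorname{arc\,tan} \frac{1}{2} = \operatorname{arc\,tan} \frac{1}{3} + \operatorname{arc\,tan} \frac{1}{7}$, so that by (9),
$$\frac{\pi}{4} = 2 \operatorname{arc\,tan} \tfrac{1}{3} + \operatorname{arc\,tan} \tfrac{1}{7}.$$
Using this formula, Vega calculated the number $\pi$ to 140 places.
By means of the equation $(\frac{1}{5} + \frac{1}{8})/(1 - \frac{1}{40}) = \frac{1}{3}$, we further obtain
$$\operatorname{arc\,tan} \tfrac{1}{3} = \operatorname{arc\,tan} \tfrac{1}{5} + \operatorname{arc\,tan} \tfrac{1}{8}$$
or
$$\frac{\pi}{4} = 2 \operatorname{arc\,tan} \tfrac{1}{5} + \operatorname{arc\,tan} \tfrac{1}{7} + 2 \operatorname{arc\,tan} \tfrac{1}{8}.$$
This expansion is extremely useful for calculating $\pi$ by means of the series
$\operatorname{arc\,tan} x = x - x^3/3 + x^5/5 - + \cdots$; for if we substitute for $x$ the value
$\frac{1}{5}$, $\frac{1}{7}$, or $\frac{1}{8}$, we obtain with but few terms a high degree of accuracy, since the
terms diminish rapidly.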

The reader who is not especially interested in these skilful, yet artificial 
manipulations, might be satisfied with an understanding of the principle. 
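
The contrast between the slowly converging Leibnitz series and the three-term arc tangent formula can be made concrete. The following Python sketch is illustrative only: the helper names and the choice of ten terms are assumptions made here, and exact rational arithmetic (`fractions.Fraction`) is used so that only the truncation of the series, not rounding, enters before the final conversion.

```python
from fractions import Fraction
import math

def arctan_series(x, terms):
    """Partial sum x - x^3/3 + x^5/5 - ... with the given number of terms."""
    total, power = Fraction(0), x
    for k in range(terms):
        total += power / (2 * k + 1) if k % 2 == 0 else -power / (2 * k + 1)
        power *= x * x
    return total

def pi_estimate(terms):
    """4 * (2 arctan 1/5 + arctan 1/7 + 2 arctan 1/8), each series truncated."""
    quarter = (2 * arctan_series(Fraction(1, 5), terms)
               + arctan_series(Fraction(1, 7), terms)
               + 2 * arctan_series(Fraction(1, 8), terms))
    return float(4 * quarter)

leibniz = 4 * sum((-1) ** k / (2 * k + 1) for k in range(10))
print("Leibnitz series, 10 terms:", leibniz)
print("arc tan formula, 10 terms:", pi_estimate(10))
print("math.pi                  :", math.pi)
```

With ten terms the Leibnitz series is still wrong in the first decimal place, whereas the arc tangent formula already agrees with `math.pi` to about the precision of floating-point numbers.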


*c. Calculation of Logarithms 


For the numerical calculation of logarithms we transform the
logarithmic series [Eq. (5), p. 444]
$$\frac{1}{2} \log \frac{1 + x}{1 - x} = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots,$$
where $0 < x < 1$, by the substitution
$$\frac{1 + x}{1 - x} = \frac{p^2}{p^2 - 1}, \qquad x = \frac{1}{2p^2 - 1}$$
into the series
$$\log p = \frac{1}{2} \log(p - 1) + \frac{1}{2} \log(p + 1) + \frac{1}{2p^2 - 1} + \frac{1}{3(2p^2 - 1)^3} + \cdots,$$
where $2p^2 - 1 > 1$ or $p^2 > 1$. If $p$ is an integer and $p + 1$ can be
resolved into smaller integral factors (for example, if $p + 1$ is even),
this last series expresses the logarithm of $p$ by the logarithms of smaller
integers plus a series whose terms diminish very rapidly and whose sum
can therefore be calculated accurately enough by use of only a few
terms. From this series we can therefore calculate successively the
logarithms of any prime number, and hence of any number, provided we
have already calculated the value of $\log 2$ (for example, by its integral
representation, as on p. 489).

The accuracy of this determination of $\log p$ can be estimated more
easily by means of the geometric series than from the general formula
for the remainder. For the remainder $R_n$ of the series, that is, the sum
of all the terms following the term $1/n(2p^2 - 1)^n$, we have
$$R_n < \frac{1}{(n + 2)(2p^2 - 1)^{n+2}} \left( 1 + \frac{1}{(2p^2 - 1)^2} + \frac{1}{(2p^2 - 1)^4} + \cdots \right) = \frac{1}{(n + 2)(2p^2 - 1)^n} \cdot \frac{1}{(2p^2 - 1)^2 - 1},$$
and this formula immediately gives the required estimate of the error.


Let us for example calculate $\log_e 7$ (under the assumption that $\log 2$ and
$\log 3$ have already been found numerically), using the first four terms of the
series. We have
$$p = 7, \qquad 2p^2 - 1 = 97,$$
$$\log 7 = 2 \log 2 + \tfrac{1}{2} \log 3 + \frac{1}{97} + \frac{1}{3 \cdot 97^3} + \cdots,$$
$$\frac{1}{97} \approx 0.01030928, \qquad \frac{1}{3 \cdot 97^3} \approx 0.00000037,$$
$$2 \log 2 \approx 1.38629436, \qquad \tfrac{1}{2} \log 3 \approx 0.54930614;$$
hence
$$\log_e 7 \approx 1.94591015.$$

Estimation of the error gives
$$R_3 < \frac{1}{5 \cdot 97^3} \times \frac{1}{97^2 - 1} < \frac{1}{36 \times 10^9}.$$

However, we note that each of the four numbers which we have added is
only given to within an error of $5 \times 10^{-9}$, so that the last place in the
computed value of $\log 7$ might be wrong by 2. As a matter of fact, however, the
last place is also correct.
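
The computation of $\log_e 7$ can be repeated mechanically. In the Python sketch below the function name `log_from_neighbors` is an assumption, and `math.log` stands in for the values of $\log 2$ and $\log 3$ that the text supposes to be known already; it is also used at the end to judge the result.

```python
import math

def log_from_neighbors(p, log_pm1, log_pp1, terms):
    """log p from log(p-1), log(p+1) and `terms` terms of the series in 1/(2p^2-1)."""
    q = 2 * p * p - 1
    series = sum(1.0 / ((2 * k + 1) * q ** (2 * k + 1)) for k in range(terms))
    return 0.5 * log_pm1 + 0.5 * log_pp1 + series

log2, log3 = math.log(2.0), math.log(3.0)      # assumed already known
# log 6 = log 2 + log 3 and log 8 = 3 log 2, so that
# log 7 = 2 log 2 + (1/2) log 3 + 1/97 + 1/(3 * 97^3) + ...
approx = log_from_neighbors(7, log2 + log3, 3.0 * log2, terms=2)

n, q = 3, 97
remainder_bound = 1.0 / ((n + 2) * q ** n) / (q * q - 1)

print(f"{approx:.8f}")
print(abs(math.log(7.0) - approx), remainder_bound)
```

Two terms of the series already give the eight decimal places quoted above, and the observed error is of the size of the remainder estimate, a little over $2 \times 10^{-11}$.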


6.3 Numerical Solution of Equations 


We add some remarks about the numerical solution of the equation
$f(x) = 0$, where $f(x)$ need not be a polynomial.¹ We start with some
tentative first value $x_0$ of one of the roots and then improve this
approximation. How the first approximation for the root is chosen and how
good that approximation is may be left open. We may, for example,
take a rough guess, or better, obtain a first approximation from the
graph of the function $y = f(x)$, whose intersection with the $x$-axis
indicates the required root.

1 We are, of course, concerned only with the determination of real roots of $f(x) = 0$.

Then we try to improve the approximation by a process or mapping
which takes the value $x_0$ into a “second approximation,” and repeat this
process. Solving the equation $f(x) = 0$ numerically consists in carrying
out such successive approximations repeatedly (or as one says
“iterating” the process) with the expectation that the iterated values
$x_1, x_2, \ldots, x_n$ converge satisfactorily to the root $\xi$. We shall consider
various such procedures and briefly discuss their accuracy.


a. Newton’s Method 


Description of Method. Newton’s iterative procedure is based on the
fundamental principle of the differential calculus: the replacing of a
curve by a tangent in the immediate neighborhood of the point of
contact. Starting from a first approximate value $x_0$ for a root $\xi$ of the
equation $f(x) = 0$ we consider the point on the graph of the function
$y = f(x)$ whose coordinates are $x = x_0$, $y = f(x_0)$. To find a better
approximation for the intersection $\xi$ of the curve with the $x$-axis we
determine the point $x_1$ where the tangent at the point $x = x_0$, $y = f(x_0)$
intersects the $x$-axis. The abscissa $x_1$ of this intersection represents
a new and, under certain circumstances, a better approximation than
$x_0$ to the required root $\xi$ of the equation.

Figure 6.4 at once gives
$$\frac{f(x_0)}{x_0 - x_1} = f'(x_0);$$
hence the new approximation
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \tag{10}$$

Starting with $x_1$ as an approximation, we repeat the process to find
$x_2 = x_1 - f(x_1)/f'(x_1)$ and so on.
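
The iteration (10) is easily written out. The following Python sketch is a minimal illustration; the function name `newton`, the stopping tolerance, the iteration limit, and the sample equation $x^2 - 2 = 0$ are assumptions made here and are not part of the method's description.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until successive values agree."""
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise ZeroDivisionError("horizontal tangent: no intersection with the x-axis")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

# Hypothetical example: the positive root of x^2 - 2 = 0, starting from x_0 = 1.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```

The explicit failure cases reflect the caution expressed in the text: the construction breaks down at a horizontal tangent and need not converge for every starting value.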

The usefulness of this process depends essentially on the nature of the
curve $y = f(x)$. In the situation indicated in Fig. 6.4 the successive
approximations $x_n$ converge with increasing accuracy to the required
root $\xi$.

However, Fig. 6.5 shows that with a plausible choice of the
original value $x_0$ our construction need not converge to the required
root at all. It is therefore necessary to examine in general the
circumstances under which Newton’s method furnishes useful approximations
to the solution of the equation.


Figure 6.4 Newton’s method of approximation.


Figure 6.5


Quadratic Convergence of Newton’s Method 


Assuming that in a sufficiently wide interval about the root $\xi$ the
second derivative $f''(x)$ is not “too large” and the first derivative
$f'(x)$ not “too small”, the main fact concerning Newton’s approximation is
that the successive “errors”

$$h_1 = \xi - x_1, \quad h_2 = \xi - x_2, \quad \ldots, \quad h_n = \xi - x_n$$

converge to zero quadratically in the sense that $|h_{n+1}| \le \mu h_n^2$
with a fixed constant $\mu$. This indicates an extremely rapid rate of
convergence; if we write the inequality in the form
$|h_{n+1}\mu| \le |h_n \mu|^2$ it implies, for example, that when
$|h_n \mu| < 10^{-m}$ we have $|h_{n+1}\mu| < 10^{-2m}$, that is, the number
of “significant digits” in $\mu x_n$ is doubled at each step.

The proof of the quadratic convergence is immediate. From the relations
$x_{n+1} = x_n - f(x_n)/f'(x_n)$ and $f(\xi) = 0$ we find that

$$h_{n+1} = \xi - x_{n+1} = \xi - x_n + \frac{f(x_n) - f(\xi)}{f'(x_n)}.$$

By Taylor’s formula

$$f(\xi) - f(x_n) = (\xi - x_n) f'(x_n) + \tfrac{1}{2} (\xi - x_n)^2 f''(\eta),$$

where $\eta$ lies between $\xi$ and $x_n$. Hence

$$h_{n+1} = -\frac{f''(\eta)}{2 f'(x_n)}\, h_n^2. \tag{11}$$

To establish convergence we assume that $x_n$ belongs already to a fixed
interval $\xi - \delta < x < \xi + \delta$ in which $|f''|$ has the maximum
value $M_2$, $|f'|$ the positive minimum value $m_1$, and for which $\delta$
is so small that $\tfrac{1}{2}\delta M_2/m_1 < 1$. Putting
$\mu = \tfrac{1}{2} M_2/m_1$ we have $\mu\delta < 1$ and

$$|h_{n+1}| \le \mu |h_n|^2 \le \mu\delta\, |h_n| < |h_n|.$$

This inequality shows first of all that $x_{n+1}$ belongs again to the same
$\delta$-neighborhood of $\xi$ so that the argument can be repeated. Thus, if
only $x_0$ lies in the $\delta$-neighborhood of $\xi$, all subsequent $x_n$
will do the same. From $|h_{n+1}| \le \mu\delta\, |h_n|$ it follows then that
$|h_{n+1}| \le (\mu\delta)^{n+1} |h_0|$, which implies that $h_n \to 0$ or
that $x_n \to \xi$; moreover, the quadratic law of decrease
$|h_{n+1}| \le \mu |h_n|^2$ will hold for the errors. It is clear then that
Newton’s method will provide us with a sequence $x_n$ which certainly
converges toward the solution $\xi$ provided $f'$ and $f''$ exist and are
continuous near $\xi$, that $f'(\xi) \ne 0$, and that $x_0$ is already
sufficiently close to $\xi$. The quadratic character of the approximation is
often a decided advantage of Newton’s method over others (see p. 503).
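The quadratic law can be observed directly. The sketch below is an illustration under stated assumptions rather than part of the text: it takes the test equation $f(x) = x^2 - 2 = 0$ (chosen here only because its root is available from `math.sqrt`), and the helper name `newton_errors` is hypothetical. For $x \ge 1$ we have $|f''|/(2|f'|) = 1/(2x) \le 1/2$, so $\mu = 1/2$ serves as the constant in $|h_{n+1}| \le \mu h_n^2$.

```python
import math


def newton_errors(x0, steps):
    """Return the errors h_n = xi - x_n for f(x) = x^2 - 2, with xi = sqrt(2)."""
    xi = math.sqrt(2.0)
    x = x0
    errors = [xi - x]
    for _ in range(steps):
        x = x - (x * x - 2.0) / (2.0 * x)  # one Newton step
        errors.append(xi - x)
    return errors


if __name__ == "__main__":
    for n, h in enumerate(newton_errors(1.5, 4)):
        print(n, h)
```

Each printed error is roughly the square of the previous one, so the number of correct digits about doubles per step until rounding error takes over.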


*b. The Rule of False Position


Newton’s method is the limiting case of an older method, the “rule of false
position,” in which the secant appears in place of the tangent. Let us assume
that we know two points $(x_0, y_0)$ and $(x_1, y_1)$ in the neighborhood of
the required intersection with the x-axis. If we replace the curve by the
secant joining these two points, the intersection of this secant with the
x-axis can be an improved approximation to the required root¹ of the
equation. For the abscissa $\bar{\xi}$ of the point of intersection, we have
(Fig. 6.6)

$$\frac{\bar{\xi} - x_0}{f(x_0)} = \frac{\bar{\xi} - x_1}{f(x_1)}, \tag{12}$$

which leads to

$$\bar{\xi} = \frac{x_0 f(x_1) - x_1 f(x_0)}{f(x_1) - f(x_0)}
= \frac{x_0 f(x_1) - x_0 f(x_0) + x_0 f(x_0) - x_1 f(x_0)}{f(x_1) - f(x_0)}$$

or

$$\bar{\xi} = x_0 - \frac{f(x_0)}{\dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}. \tag{13}$$

This formula, which determines the further approximation $\bar{\xi}$ from
$x_0$ and $x_1$, constitutes the rule of false position. It is useful if one
value of the function is positive and the other negative, say as in Fig. 6.6,
where $y_0 > 0$ and $y_1 < 0$.

Figure 6.6 The rule of false position.

The approximation formula of Newton results as a limiting case for
$x_1 \to x_0$, for the denominator of the second term on the right-hand side
of formula (13) tends to $f'(x_0)$ as $x_1$ tends to $x_0$.

Although the rule of false position may be considered more elementary than
Newton’s method, the latter has the great convenience of requiring only one
value of x as initial approximation instead of two values.

1 This amounts essentially to linear interpolation applied to the inverse function.
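Formula (13) amounts to a single line of arithmetic. The following Python sketch is an illustration only; the function name `false_position_step` is an assumption, and the cubic $x^3 - 2x - 5$ with the values 2 and 2.1 is used merely as sample data.

```python
def false_position_step(f, x0, x1):
    """Abscissa where the secant through (x0, f(x0)) and (x1, f(x1)) meets the x-axis."""
    f0, f1 = f(x0), f(x1)
    if f1 == f0:
        raise ZeroDivisionError("secant is horizontal")
    # Formula (13): x0 minus f(x0) divided by the difference quotient.
    return x0 - f0 * (x1 - x0) / (f1 - f0)


def cubic(x):
    return x ** 3 - 2.0 * x - 5.0


if __name__ == "__main__":
    print(false_position_step(cubic, 2.0, 2.1))
    # Letting x1 approach x0 gives nearly the Newton step from x0 = 2.
    print(false_position_step(cubic, 2.0, 2.0 + 1e-7))
```

The second call shows the limiting case numerically: as $x_1$ approaches $x_0$ the difference quotient approaches $f'(x_0)$, and the result approaches the Newton value.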


c. The Method of Iteration 


The Iteration Scheme. We now turn to a far-reaching scheme for solving
equations written in the form

$$x = \phi(x),$$

where $\phi$ is a continuous function with a continuous derivative. The
solution of equations of the form $f(x) = 0$ can be reduced to that of
$x = \phi(x)$ if we put $\phi(x) = x - c(x) f(x)$ where $c(x)$ is any function
different from zero.

In the particularly suggestive method of iteration¹ we begin again with a
suitably chosen initial approximating value $x_0$ and then determine a
sequence $x_1, x_2, x_3, \ldots$ of values by the conditions

$$x_{n+1} = \phi(x_n), \qquad n = 0, 1, 2, \ldots.$$

If this “iteration” sequence $x_n$ converges to a limit $\xi$, then
$\xi = \phi(\xi)$ is a solution of our equation, since then
$\lim_{n \to \infty} x_{n+1} = \xi$ and
$\lim_{n \to \infty} \phi(x_n) = \phi(\xi)$ because of the continuity of the
function $\phi$.


Convergence. The sequence of values $x_n$ in the iteration process converges
to a solution under a very general assumption: If the first approximation
$x_0$ lies in an interval² $J$ about the solution $\xi$, in which

$$|\phi'(x)| < q$$

with a constant $q < 1$, then $x_n$ converges to $\xi$.
For supposing that $x_0$ lies in $J$, we have

$$x_1 - \xi = \phi(x_0) - \phi(\xi).$$

By the mean value theorem, the right-hand side of this equation equals
$(x_0 - \xi)\phi'(\bar{x})$, where $\bar{x}$ lies in $J$. Thus by our
assumption

$$|x_1 - \xi| < q\, |x_0 - \xi|,$$

so that $x_1$ belongs to $J$, and then also

$$|x_2 - \xi| < q\, |x_1 - \xi| < q^2 |x_0 - \xi|.$$

In general, we obtain

$$|x_n - \xi| < q^n |x_0 - \xi|;$$

since $q^n \to 0$ as $n \to \infty$, our assertion is proved.

We see, moreover, from the preceding, that the iteration sequence $x_n$ does
not converge when $|\phi'(x)| > 1$ in an interval about $\xi$; if
$|\phi'(\xi)| = 1$ we cannot make a general statement.

1 Sometimes called the method of successive approximation. The method is used in
many different mathematical contexts for solving equations of one kind or other.
2 Although $\xi$ is unknown, we can very often determine such an interval a priori.
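A short Python sketch of the iteration scheme follows. It is an illustration under assumptions: the name `iterate`, the tolerance, and the step cap are hypothetical, and the sample equation $x = \cos x$ does not come from the text. It was picked because $|\phi'(x)| = |\sin x| < 1$ near the fixed point, so the hypothesis of the convergence statement is met there.

```python
import math


def iterate(phi, x0, tol=1e-12, max_iter=500):
    """Iterate x_{n+1} = phi(x_n); return (approximate fixed point, steps used)."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("iteration did not settle within max_iter steps")


if __name__ == "__main__":
    root, steps = iterate(math.cos, 1.0)
    print(root, steps)
```

The number of steps needed is several dozen, which reflects the geometric decrease $q^n$ of the error rather than the quadratic decrease seen in Newton's method.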


Attracting and Repelling Fixed Points 


It is useful to consider the iteration process in terms of a mapping or
transformation. The function $y = \phi(x)$ represents a transformation which
maps a point $x$ on the number axis into an image point $y$ of this number
axis (see p. 20). The solution $\xi$ is then a point not changed by the
transformation $\phi$, a so-called fixed point, and the problem is thus one of
finding a fixed point of the mapping; this problem is solvable by iteration
when $|\phi'(\xi)| < q < 1$, as we have seen.

The mapping $y = \phi(x)$ of the neighborhood of the root or fixed point
$\xi$ has, for $|\phi'(x)| < q < 1$, the property of being contracting, that
is, diminishing the distance of the original from the fixed point. Such fixed
points of contracting mappings are called attracting fixed points. Their
construction by iteration converges as the terms of a geometric series with
the quotient $q$.

If the root $\xi$, or the corresponding fixed point of our transformation, is
in an interval in which $|\phi'(x)| > r$, where $r$ is a constant larger than
1, the transformation is expanding, the iteration process diverges, and the
fixed point is called repelling.

If at the fixed point we have $|\phi'(\xi)| = 1$, no general statements
concerning the convergence of the iterations can be made; such fixed points
are sometimes called indifferent.

The following observation should be stressed: a fixed point of the mapping
$\phi$ is automatically also a fixed point for $\psi$, the inverse mapping:
$\xi = \psi(\xi)$. If $|\phi'(x)| > 1$ in a neighborhood of a root $\xi$ and
$x = \psi(y)$ is the inverse function of $\phi$, then $|\psi'(y)| < 1$. Thus
$\xi$ is an attracting fixed point for this inverse mapping and it is possible
to replace the originally divergent iteration scheme by a convergent one for
the inverse mapping. As an example we consider the equation

$$x = \tan x.$$

Figure 6.7 Intersection $(\xi, \xi)$ of the curves $y = \tan x$ and $y = x$.

It is clear from the graphs of the functions $y = x$ and $y = \tan x$ that
these intersect somewhere in the interval $\pi < x < \tfrac{3}{2}\pi$ and that
our equation will have a root $\xi$ in that interval (Fig. 6.7). Since

$$\frac{d \tan x}{dx} = \frac{1}{\cos^2 x} > 1,$$

the iteration procedure with any point $x_0$ in the interval does not
converge. However, we obtain a convergent iteration sequence if we write the
equation in the inverse form (using the notation arc tan x for the principal
branch),

$$x = \operatorname{arc\,tan} x + \pi.$$

Since here

$$\frac{d}{dx} \operatorname{arc\,tan} x = \frac{1}{1 + x^2} < 1,$$

the sequence defined by $x_{n+1} = \operatorname{arc\,tan} x_n + \pi$ and,
say, $x_0 = \pi$, converges to $\xi$.
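The two forms of the equation can be compared numerically. In the following sketch the function name `tan_root` and the fixed number of steps are assumptions made for illustration; the iteration itself is the one just described, with `math.atan` playing the role of the principal branch.

```python
import math


def tan_root(x0=math.pi, steps=30):
    """Iterate x_{n+1} = arctan(x_n) + pi, the inverse form of x = tan x."""
    x = x0
    for _ in range(steps):
        x = math.atan(x) + math.pi
    return x


if __name__ == "__main__":
    xi = tan_root()
    print(xi, math.tan(xi) - xi)
    # One step of the original form x -> tan x moves away from the root.
    x = 4.5
    print(abs(x - xi), abs(math.tan(x) - xi))
```

The second printed pair shows the repelling behaviour of the original form: a starting value close to the root is carried farther from it by a single application of the tangent.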
d. Iterations and Newton’s Procedure


As mentioned before, the solution of an equation of the form $f(x) = 0$ can be
reduced to that of the form $x = \phi(x)$ if we choose for $\phi$ any
expression of the form

$$\phi(x) = x - c(x) f(x)$$

where $c(x)$ is a nonvanishing function. If we want to solve the resulting
equation $x = \phi(x)$ by iteration we have to make sure by a suitable choice
of $c(x)$ that the fixed point $\xi$ of the mapping $\phi$ is “attractive”,
that is, that $|\phi'(\xi)| < 1$. Now for the solution $\xi$ of $f(\xi) = 0$
we have

$$\phi'(\xi) = 1 - c'(\xi) f(\xi) - c(\xi) f'(\xi) = 1 - c(\xi) f'(\xi).$$

The simplest choice is to take for $c(x)$ the expression $1/f'(x)$. Then
certainly $|\phi'(\xi)| = 0 < 1$. This choice of $c(x)$ leads to the iteration
sequence

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},$$

which is just the sequence of approximations (10), p. 495, in Newton’s method.
For the error $x_n - \xi = h_n$, we have the estimate

$$|h_{n+1}| = |\phi(x_n) - \phi(\xi)| \le q\, |h_n|,$$

where $q$ is the maximum of $|\phi'(x)|$ in the interval with end points $\xi$
and $x_n$. Since here

$$\phi'(x) = \frac{f(x) f''(x)}{f'(x)^2}$$

and $f(x) = f(x) - f(\xi) = f'(\eta)(x - \xi)$, we see that $q$ itself is of
the order of $h_n$, and thus confirm again the quadratic character of the
approximation in Newton’s method.

Another simple choice for $c(x)$ is to take the constant value $1/f'(x_0)$,
leading to the recursion formula

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_0)}.$$

Here $\phi'(\xi) = 1 - f'(\xi)/f'(x_0)$. If $f'$ is continuous and different
from zero, we will have an attractive fixed point $\xi$ if our initial
approximation $x_0$ is already so close to the solution $\xi$ that

$$|\phi'(\xi)| = \frac{|f'(x_0) - f'(\xi)|}{|f'(x_0)|} < 1.$$

This iteration sequence is somewhat simpler than the one used in Newton’s
method; however, convergence will be much slower, like that for a geometric
progression, as is the case with most iteration schemes.
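The difference in speed between the two choices of $c(x)$ can be made concrete by counting steps. The sketch below is only an illustration: the helper names, the tolerance, and the use of the cubic $x^3 - 2x - 5$ with $x_0 = 2$ as a test case are assumptions, and the reference root is itself obtained by running Newton's method to convergence.

```python
def f(x):
    return x ** 3 - 2.0 * x - 5.0


def fprime(x):
    return 3.0 * x * x - 2.0


def newton_step(x):
    return x - f(x) / fprime(x)            # c(x) = 1/f'(x)


def make_fixed_slope_step(x0):
    slope = fprime(x0)                      # c = 1/f'(x0), kept constant
    return lambda x: x - f(x) / slope


def reference_root(x0=2.0, steps=20):
    x = x0
    for _ in range(steps):
        x = newton_step(x)
    return x


def steps_needed(step, x0, root, tol=1e-12, max_iter=200):
    """Count iterations until |x_n - root| <= tol."""
    x, n = x0, 0
    while abs(x - root) > tol and n < max_iter:
        x = step(x)
        n += 1
    return n


if __name__ == "__main__":
    root = reference_root()
    print(steps_needed(newton_step, 2.0, root))
    print(steps_needed(make_fixed_slope_step(2.0), 2.0, root))
```

With the constant slope, each step reduces the error by roughly the factor $|\phi'(\xi)|$, so the step count grows in proportion to the number of digits wanted, whereas for Newton's method it grows only like the logarithm of that number.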


Examples. As an example we consider the cubic equation

$$f(x) = x^3 - 2x - 5 = 0.$$

Since $f(2) = -1 < 0$, $f(3) = 16 > 0$, a root $\xi$ certainly exists in the
interval $2 < x < 3$. Since, moreover, $f'(x) = 3x^2 - 2 > 3(2)^2 - 2 > 0$,
the interval contains only one root. By Newton’s method we find, starting with
the approximation $x_0 = 2$, successively

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 2 - \frac{2^3 - 2 \cdot 2 - 5}{3(2)^2 - 2} = 2.1, \qquad f(x_1) = 0.061,$$

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 2.1 - \frac{0.061}{3(2.1)^2 - 2} = 2.094568.$$

Since $f(2.1) > 0$, $f(2) < 0$, the root $\xi$ lies between 2 and 2.1. In the
interval $1.9 < x < 2.2$, and a fortiori then in the interval
$\xi - 0.1 < x < \xi + 0.1$, we have the estimates

$$|f''(x)| = |6x| < 6(2.2) = 13.2,$$
$$f'(x) = 3x^2 - 2 > 3(1.9)^2 - 2 = 8.83.$$

It follows [see (11), p. 497] that

$$|\xi - x_{n+1}| \le \frac{13.2}{2(8.83)}\, |x_n - \xi|^2 < 0.75\, |x_n - \xi|^2$$

provided $|x_n - \xi| < 0.1$. Since $|x_0 - \xi| = |\xi - 2| < 0.1$, we find
successively

$$|x_1 - \xi| < (0.75)(0.1)^2 = 0.0075,$$
$$|x_2 - \xi| < (0.75)(0.0075)^2 < 0.000043.$$

If this degree of approximation is not sufficient, we obtain a further
approximation $x_3$ with an error $< (0.75)(0.000043)^2 < 0.000\,000\,001\,4$.
All $x_n$ after $x_0$ must be larger than $\xi$ as is obvious from the fact
that $f'$ and $f''$ are positive, which implies that

$$h_{n+1} = -\frac{f''(\eta)\, h_n^2}{2 f'(x_n)} < 0.$$
Applying instead the rule of false position [(13), p. 498] to the values
$x_0$, $x_1$ we find for the intersection $\bar{\xi}$ with the x-axis of the
secant joining the points $(x_0, f(x_0))$ and $(x_1, f(x_1))$

$$\bar{\xi} = x_0 - \frac{f(x_0)(x_1 - x_0)}{f(x_1) - f(x_0)} = 2.09425\ldots.$$

Since the curve is convex in the interval in question, the secant lies above
the curve and the approximation $\bar{\xi}$ must be less than the root $\xi$.
As a second example, let us solve the equation

$$f(x) = x \log_{10} x - 2 = 0.$$

We have $f(3) = -0.6$ and $f(4) = +0.4$, and therefore use $x_0 = 3.5$ as a
first approximation. Using ten-digit logarithmic tables we obtain the
successive approximations

$$x_0 = 3.5, \qquad x_1 = 3.598,$$
$$x_2 = 3.5972849, \qquad x_3 = 3.5972850235.$$
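The second example can be rechecked with floating-point arithmetic in place of logarithmic tables. The sketch below is illustrative; the helper names are assumptions, and the derivative is taken as $\log_{10} x + 1/\ln 10$, which follows from differentiating $x \log_{10} x$.

```python
import math


def f(x):
    return x * math.log10(x) - 2.0


def fprime(x):
    # d/dx [x * log10(x)] = log10(x) + 1/ln(10)
    return math.log10(x) + 1.0 / math.log(10.0)


def newton_iterates(x0, steps):
    """Return the list [x_0, x_1, ..., x_steps] of Newton approximations."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs


if __name__ == "__main__":
    for n, x in enumerate(newton_iterates(3.5, 4)):
        print(n, x)
```

The printed iterates can be compared with the tabulated values above; small differences in the early digits are to be expected, since the text rounds intermediate results.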


Appendix 
*A.1 Stirling’s Formula 


In many applications, particularly in statistics and the theory of 
probability, we find it necessary to have a simple approximation to n! as 
an elementary function of n. Such an expression is given by the follow- 
ing theorem, which bears the name of its discoverer, Stirling (see also 
Chapter 8, p. 630). 


As $n \to \infty$,

$$\frac{n!}{\sqrt{2\pi}\, n^{n+1/2} e^{-n}} \to 1; \tag{14}$$

more exactly,

$$\sqrt{2\pi}\, n^{n+1/2} e^{-n} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n} \left(1 + \frac{1}{4n}\right). \tag{14a}$$

In other words, the expressions $n!$ and $\sqrt{2\pi}\, n^{n+1/2} e^{-n}$
differ only by a small percentage when the value of $n$ is large (as we say,
the two expressions are asymptotically equal) and at the same time the factor
$1 + 1/4n$ gives us an estimate of the degree of accuracy of the
approximation.

We are led to this remarkable formula if we attempt to evaluate the area
under the curve $y = \log x$. By integration (p. 276) we find that $A_n$, the
exact area under this curve between the ordinates $x = 1$ and $x = n$, is
given by

$$A_n = \int_1^n \log x \, dx = \Big[x \log x - x\Big]_1^n = n \log n - n + 1. \tag{15}$$

1 The method used here is a special instance of the Euler MacLaurin formula which
will be discussed in Chapter 8, p. 624.


If, however, we estimate the area by the trapezoid rule, erecting ordinates
at $x = 1, x = 2, \ldots, x = n$ as in Fig. 6.8, we obtain an approximate
value $T_n$ for the area [cf. (6), p. 483]

$$T_n = \log 2 + \log 3 + \cdots + \log (n - 1) + \tfrac{1}{2} \log n
= \log n! - \tfrac{1}{2} \log n. \tag{16}$$

If we make the reasonable assumption that $A_n$ and $T_n$ are of the same
order of magnitude, we find at once that $n!$ and $n^{n+1/2} e^{-n}$ are of
the same order of magnitude, which is essentially what is stated in
Stirling's formula.

Figure 6.8

To make this argument precise, we first show that the difference
$a_n = A_n - T_n$ is bounded, from which it will immediately follow that
$T_n = A_n(1 - a_n/A_n)$ is of the same order of magnitude as $A_n$. The
difference $a_{k+1} - a_k$ is the difference between the area under the curve
and the area under the secant in the strip $k \le x \le k + 1$. Since the
curve is concave and lies above the secant, $a_{k+1} - a_k$ is positive, and
$a_n = (a_n - a_{n-1}) + (a_{n-1} - a_{n-2}) + \cdots + (a_2 - a_1) + a_1$ is
monotonic increasing. Moreover, the difference $a_{k+1} - a_k$ is clearly less
(cf. Fig. 6.9) than the difference between the area under the tangent at
$x = k + \tfrac{1}{2}$ and the area under the secant; hence we have the
inequality

$$a_{k+1} - a_k < \log\left(k + \tfrac{1}{2}\right) - \tfrac{1}{2} \log k - \tfrac{1}{2} \log (k + 1)$$
$$= \tfrac{1}{2} \log\left(1 + \frac{1}{2k}\right) - \tfrac{1}{2} \log\left(1 + \frac{1}{2(k + \tfrac{1}{2})}\right)$$
$$< \tfrac{1}{2} \log\left(1 + \frac{1}{2k}\right) - \tfrac{1}{2} \log\left(1 + \frac{1}{2(k + 1)}\right).$$

Figure 6.9

Adding these inequalities for $k = 1, 2, \ldots, n - 1$, we find that all the
terms on the right-hand side except two will cancel out, and (since
$a_1 = 0$), we have

$$a_n < \tfrac{1}{2} \log \tfrac{3}{2} - \tfrac{1}{2} \log\left(1 + \frac{1}{2n}\right) < \tfrac{1}{2} \log \tfrac{3}{2}.$$


Since $a_n$ is bounded, and in addition monotonic increasing, it tends to a
limit $a$ as $n \to \infty$. Our inequality for $a_{k+1} - a_k$ now gives us

$$a - a_n = \sum_{k=n}^{\infty} (a_{k+1} - a_k) < \tfrac{1}{2} \log\left(1 + \frac{1}{2n}\right).$$

Since by definition $A_n - T_n = a_n$, we have from (15), (16),

$$\log n! = 1 - a_n + \left(n + \tfrac{1}{2}\right) \log n - n,$$

or, writing $\alpha_n = e^{1 - a_n}$,

$$n! = \alpha_n\, n^{n+1/2} e^{-n}.$$

The sequence $\alpha_n$ is monotonic decreasing and tends to the limit
$\alpha = e^{1-a}$: hence

$$1 < \frac{\alpha_n}{\alpha} = e^{a - a_n} < e^{(1/2) \log (1 + 1/2n)} = \sqrt{1 + \frac{1}{2n}} < 1 + \frac{1}{4n}.$$

Hence we have

$$\alpha\, n^{n+1/2} e^{-n} < n! < \alpha\, n^{n+1/2} e^{-n} \left(1 + \frac{1}{4n}\right).$$

It only remains for us to find the actual value of the limit $\alpha$. Here
we make use of the formula (80) of Chapter 3, p. 282:

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{(n!)^2\, 2^{2n}}{(2n)!\, \sqrt{n}}.$$

Replacing $n!$ by $\alpha_n n^{n+1/2} e^{-n}$ and $(2n)!$ by
$\alpha_{2n} 2^{2n+1/2} n^{2n+1/2} e^{-2n}$, we immediately obtain

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{\alpha_n^2}{\alpha_{2n} \sqrt{2}} = \frac{\alpha^2}{\alpha \sqrt{2}} = \frac{\alpha}{\sqrt{2}},$$

from which $\alpha = \sqrt{2\pi}$. The proof of Stirling’s formula is thus
complete.

In addition to its theoretical interest, Stirling’s formula is a very 
useful tool for the numerical calculation of n! when n is large. Instead 
of multiplying together a large number of integers, we have merely to 
calculate Stirling’s expression by means of logarithms which involves 
far fewer operations. Thus for n = 10 we obtain the value 3598696 for 
Stirling’s expression (using seven-figure tables), whereas the exact value 
of 10! is 3628800. The percentage error is barely 5/6 of 1%.
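The inequality (14a) and the numerical claim for $n = 10$ are easy to check by machine. The following Python sketch is an illustration; the function name `stirling` is an assumption, and the range of $n$ tested is arbitrary.

```python
import math


def stirling(n):
    """Stirling's expression sqrt(2*pi) * n^(n + 1/2) * e^(-n)."""
    return math.sqrt(2.0 * math.pi) * n ** (n + 0.5) * math.exp(-n)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        exact = math.factorial(n)
        approx = stirling(n)
        upper = approx * (1.0 + 1.0 / (4.0 * n))
        # (14a) asserts approx < exact < upper.
        print(n, approx, exact, upper, 100.0 * (exact - approx) / exact)
```

The last printed column is the percentage error, which shrinks roughly in proportion to $1/n$.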


PROBLEMS 


SECTION 6.1, page 482 


1. Prove if $f''(x) > 0$, that the trapezoid rule yields a greater value and the
tangent rule a lesser value than the exact integral of $f$.


2. Estimate the value $h = (b - a)/n$ needed for a calculation by Simpson’s
rule accurate to p decimal places of 


| a eae | 
(a) log? =| dx, braf adr 


1 T 


3. Estimate in terms of k and s (k <1 and s < 1), the number of points 
needed to calculate within an error e the elliptic integral 


(s) f ol 
US) = m 
o V(1 — 22)(1 — k?z?) ` 


4. Let $f(x)$ be a continuous function on the interval
$\alpha \le x \le \alpha + h$, with a uniformly bounded derivative:
$|f'(x)| \le M_1$, for $M_1$ a constant. Prove that for any fixed point
$\xi$, $\alpha \le \xi \le \alpha + h$, the estimate

$$\left| \int_{\alpha}^{\alpha + h} f(x)\, dx - h f(\xi) \right| \le \frac{M_1 h^2}{2}$$

holds.

%0 


5. Calculate | e~*? dx numerically to within 1/100. 


a 


SECTION 6.2, page 490 
1. The period of a pendulum is given by

$$T = 2\pi \sqrt{l/g},$$

where $l$ is the length of the pendulum. If the pendulum drives a clock which
gains a minute per day determine the necessary correction in $l$.


2. To measure the height of a hill, a tower 100 meters high on top of the 
hill is observed from the plain. The angle of elevation of the base of the 
tower is 42° and the tower itself subtends an angle of 6°. What are the limits 
of error in the determination of the height if the angle 42° is subject to an 


error of 1°? 


SECTION 6.3, page 494 


1. (a) To solve the equation $x = f(x)$, show how best to choose the constant
$a$ so that the iteration scheme

$$x_{k+1} = x_k + a[x_k - f(x_k)]$$

converges as rapidly as possible in the neighborhood of the solution.


(b) Apply this method to solve the equation for V A, 
A 


| Ea 
x 


(c) Show if A > 1 that the number of accurate decimal places is at least 


doubled at each step of the iteration scheme obtained in (b). 
2. (a) Show how best to choose a polynomial 
g(x) =a + bx? 


so that the iteration scheme for VA, 


p 
Ekar = tk +g aes 


converges most rapidly in the neighborhood of the solution. 
(b) Estimate the rapidity of convergence. 


(c) Show how to further improve the convergence by suitable choices of 


polynomials g(x) which are of higher degree. 
3. Investigate suitable schemes of the type of Problems 1 and 2 for the
calculation of $\sqrt[3]{A}$.


SECTION A.1, page 504 
Yal 


vn! 
1. Prove that lim Lnc A 
e 


n= M 
n+1/2 
*2. By considering Í M log (x + x) dx, « > 0, show that 
x(x +1): (& +n) = ann! n$, 


where a, is bounded below by a positive number. Show that a, is mono- 

tonically decreasing for sufficiently large values of n. [The limit of a, as 
n — œ is 1/T(«).] 

. : . ny'ng!--- ny! 

3. Find an approximate expression for log 7 , wheren, + ng + 

e +n =n. : 


4. Show that the coefficient of x” in the binomial expansion of 


J 
asymptotically given by iat 
nn 


7 


Infinite Sums and Products 


The geometric series, Taylor’s series, and a number of examples 
previously discussed in this book, suggest that we may well study those 
limiting processes of analysis which involve the summation of infinite 
series from a more general point of view. In principle, any limiting value

$$S = \lim_{n \to \infty} s_n$$

can be written as an infinite series; we need only put $a_n = s_n - s_{n-1}$
for $n > 1$ and $a_1 = s_1$ to obtain

$$s_n = a_1 + a_2 + \cdots + a_n,$$

and the value $S$ thus appears as the limit of $s_n$, the sum of $n$ terms, as
$n$ increases. We express this fact by saying that $S$ is the “sum of the
infinite series”

$$a_1 + a_2 + a_3 + \cdots.$$

Such an “infinite sum” is simply a way of representing a limit where each
successive approximation is found from the preceding by adding one more term.
Thus the expression of a number as a decimal is in principle merely the
representation of a number $a$ in the form of an infinite series
$a = a_1 + a_2 + a_3 + \cdots$, where, if $0 < a < 1$, the term $a_\nu$ is
replaced by $\alpha_\nu \times 10^{-\nu}$ and $\alpha_\nu$ is an integer
between 0 and 9 inclusive.

Since every limiting value can be written in the form of an infinite 
series, a special study of series may seem superfluous. However, very 
often it happens that limiting values occur naturally in the form of 
such infinite series which exhibit particularly simple laws of formation. 


Not every series has an easily recognizable law of formation. For
example, the number $\pi$ can certainly be represented as a decimal (which
is a series $\sum c_\nu 10^{-\nu}$), yet we know no simple law enabling us to
state the value of an arbitrary digit, say the 7000th, of this decimal. If,
however, we consider the Leibnitz-Gregory series for $\pi/4$ instead, we have
an expression with a perfectly clear general law of formation [see (7),
p. 445].

Analogous to infinite series, in which the approximations to the 
limit are formed by repeated addition of new terms, are infinite products, 
in which the approximations to the limit arise from repeated multi- 
plication by new factors. We shall not go deeply into the general 
theory of infinite products, however; the principal subject of this 
chapter and of Chapter 8 will be infinite series. 


7.1 The Concepts of Convergence and Divergence 


a. Basic Concepts 


Cauchy's Convergence Criterion. We consider an infinite series with the
“general term” $a_n$; the series¹ is then of the form

$$a_1 + a_2 + \cdots = \sum_{\nu=1}^{\infty} a_\nu.$$

The symbol on the right with the summation sign is merely an abbreviated way
of writing the expression on the left.
If as $n$ increases, the $n$th partial sum

$$s_n = a_1 + a_2 + \cdots + a_n = \sum_{\nu=1}^{n} a_\nu$$

approaches a limit

$$S = \lim_{n \to \infty} s_n,$$

we say that the series is convergent; otherwise we say that it is divergent.
In the first case we call $S$ the sum of the series.

We have already encountered many examples of convergent series; for instance,
the geometric series $1 + q + q^2 + \cdots$, which converges to the sum
$1/(1 - q)$ when $|q| < 1$, the series for log 2, the series for $e$, and
others.

In the language of infinite series, Cauchy’s convergence test (cf. Chapter 1,
p. 75) is expressed as follows:

A necessary and sufficient condition for the convergence of a series is that
the number

$$|s_m - s_n| = |a_{n+1} + a_{n+2} + \cdots + a_m| \tag{1}$$

$(m > n)$, becomes arbitrarily small if $m$ and $n$ are chosen sufficiently
large. In other words: A series converges if, and only if, the following
condition is fulfilled: for a given positive number $\epsilon$, it is possible
to choose an index $N = N(\epsilon)$, in such a way that the above expression
$|s_m - s_n|$ is less than $\epsilon$, provided only that $m > N$ and $n > N$.

1 For formal reasons we include the possibility that certain of the numbers $a_n$ may be
zero. If all terms from an index N onward (that is, when n > N) vanish, we speak
of a terminating series.


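We can illustrate the convergence test by the geometric series for
$q = \tfrac{1}{2}$. If we choose $\epsilon = \tfrac{1}{10}$, we need only take
$N = 4$. For

$$|s_m - s_n| = \frac{1}{2^n} + \cdots + \frac{1}{2^{m-1}}
= \frac{1}{2^n} \left(1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{m-n-1}}\right) < \frac{1}{2^{n-1}}$$

and

$$\frac{1}{2^{n-1}} < \frac{1}{10} \quad \text{if } n > 4.$$

If we choose $\epsilon$ equal to $\tfrac{1}{100}$, it is sufficient to take 7
as the corresponding value of $N$, as may easily be verified.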
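The verification suggested here can be carried out by brute force. In the sketch below the helper names `partial_sum` and `worst_gap` are assumptions, and the search is cut off at an arbitrary upper index `m_max`; it therefore illustrates the choice of $N$ but does not replace the estimate above, which covers all $m$.

```python
def partial_sum(n, q=0.5):
    """s_n = 1 + q + ... + q^(n-1), the sum of the first n terms."""
    return sum(q ** k for k in range(n))


def worst_gap(N, m_max=60, q=0.5):
    """Largest |s_m - s_n| over all N < n < m <= m_max."""
    return max(
        abs(partial_sum(m, q) - partial_sum(n, q))
        for n in range(N + 1, m_max)
        for m in range(n + 1, m_max + 1)
    )


if __name__ == "__main__":
    print(worst_gap(4), worst_gap(3))   # compare with epsilon = 1/10
    print(worst_gap(7), worst_gap(6))   # compare with epsilon = 1/100
```

The comparison with $N = 3$ and $N = 6$ shows that the indices 4 and 7 cannot be lowered for these two values of $\epsilon$.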


Obviously, it is a necessary condition for the convergence of a series that

$$\lim_{n \to \infty} a_n = 0.$$

Otherwise, the convergence criterion certainly cannot be fulfilled for
$m = n + 1$. But this necessary condition is by no means sufficient for
convergence; on the contrary, it is easy to find infinite series whose general
term $a_n$ approaches 0 as $n$ increases, but whose sum does not exist, since
the partial sum $s_n$ increases without limit as $n$ increases.


Examples. An example is the series

$$1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} + \cdots,$$

the general term of which is $1/\sqrt{n}$. We immediately see that

$$s_n > \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \cdots + \frac{1}{\sqrt{n}} = \frac{n}{\sqrt{n}} = \sqrt{n}.$$

The $n$th partial sum increases beyond all bounds as $n$ increases, and
therefore the series diverges.

The same is true for the classic example of the harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots.$$

Here

$$a_{n+1} + \cdots + a_{2n} = \frac{1}{n+1} + \cdots + \frac{1}{2n} \ge \frac{1}{2n} + \cdots + \frac{1}{2n} = \frac{1}{2}.$$

Since $n$ and $m = 2n$ can be chosen to be as large as we please, the series
diverges, for Cauchy’s test is not fulfilled; in fact, the $n$th partial sum
obviously tends to infinity, since all the terms are positive. On the other
hand, the series formed from the same numbers with alternating signs,

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots + \frac{(-1)^{n-1}}{n} + \cdots,$$

converges [cf. (4) Chapter 5, p. 443], and has the sum log 2.
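Both statements about the harmonic terms can be checked numerically. The Python sketch below is an illustration with hypothetical helper names: `harmonic_block` sums the block $a_{n+1} + \cdots + a_{2n}$ used in the Cauchy argument, and `alternating_partial_sum` forms the partial sums of the alternating series.

```python
import math


def harmonic_block(n):
    """1/(n+1) + ... + 1/(2n), which never drops below 1/2."""
    return sum(1.0 / k for k in range(n + 1, 2 * n + 1))


def alternating_partial_sum(n):
    """1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, harmonic_block(n))
    print(alternating_partial_sum(10000), math.log(2.0))
```

The blocks stay above 1/2 however large $n$ is taken, while the alternating partial sums approach log 2, although slowly.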
It is by no means true that in every divergent series $s_n$ tends to
$+\infty$ or $-\infty$. Thus in the series

$$1 - 1 + 1 - 1 + 1 - + \cdots,$$

we see that the partial sum $s_n$ has the values 1 and 0 alternately, and on
account of this oscillation backward and forward, neither approaches a
definite limit nor increases numerically beyond all bounds.

The following fact, although it is self-evident, is very important and should
be noted. The convergence or divergence of a series is not changed by
inserting a finite number of terms or by removing a finite number of terms. As
far as convergence or divergence is concerned, it does not matter in the least
whether we begin the series at the term $a_0$, or $a_1$, or $a_5$, or any
other term chosen arbitrarily.


b. Absolute Convergence and Conditional Convergence 


The harmonic series $1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \cdots$
diverges, but if we change the sign of every other term the resulting series
for log 2 converges. On the other hand, the geometric series
$1 - q + q^2 - q^3 + - \cdots$ converges and has the sum $1/(1 + q)$, provided
that $0 \le q < 1$, and on making all the signs plus we obtain the series

$$1 + q + q^2 + q^3 + \cdots,$$

which is also convergent, having the sum $1/(1 - q)$.

Here there appears a distinction which we must examine. With a series whose
terms are all positive there are only two possible cases; either it converges
or the partial sum increases beyond all bounds as $n$ increases. For the
partial sums, being a monotonic increasing sequence, must converge if they
remain bounded. Convergence occurs if the individual terms approach zero
rapidly enough as $n$ increases; on the other hand, divergence occurs if the
terms do not approach zero at all or if they approach zero too slowly.
However, in series some terms of which are positive and some negative, it may
be that the changes of sign bring about convergence, when too great an
increase in the partial sums, due to the positive terms, is compensated by the
negative terms, so that as the final result a definite limit is approached.

To understand the possibilities better we consider a series
$\sum_{\nu=1}^{\infty} a_\nu$ having positive and negative terms and form for
comparison the series which has the same terms all with positive signs, that
is,

$$|a_1| + |a_2| + \cdots = \sum_{\nu=1}^{\infty} |a_\nu|.$$

If this series converges, then for sufficiently large values of $n$ and
$m > n$, the expression

$$|a_{n+1}| + |a_{n+2}| + \cdots + |a_m|$$

will certainly be as small as we please; because of the relation

$$|a_{n+1} + \cdots + a_m| \le |a_{n+1}| + \cdots + |a_m|$$

the expression on the left is also arbitrarily small, and so by the Cauchy
test the original series $\sum_{\nu=1}^{\infty} a_\nu$ converges. In this case
the original series is said to be absolutely convergent. Its convergence is
due to the absolute smallness of its terms and does not depend on the changes
in sign.

If, on the other hand, the series with the terms $|a_\nu|$ diverges and the
original series still converges, we say that the original series is
conditionally convergent. Conditional convergence results from the terms of
opposite signs compensating one another.


Leibnitz’s Test. For conditional convergence Leibnitz’s convergence test is
frequently useful:

If the terms of a series are of alternating sign and in addition their
absolute values $|a_n|$ tend monotonically to 0 (so that
$|a_{n+1}| \le |a_n|$), the series $\sum_{\nu=1}^{\infty} a_\nu$ converges.
[Example: Leibnitz’s series, (7), p. 445.]

For the proof we assume that $a_1 > 0$, which does not limit the generality of
the argument, and write our series in the form

$$b_1 - b_2 + b_3 - b_4 + - \cdots,$$

where all the terms $b_n$ are now positive, $b_n$ tends to zero, and the
condition $b_{n+1} \le b_n$ is satisfied. If we bracket the terms together in
the two different ways

$$b_1 - (b_2 - b_3) - (b_4 - b_5) - \cdots$$

and

$$(b_1 - b_2) + (b_3 - b_4) + (b_5 - b_6) + \cdots$$

we see at once that the partial sums $s_n = \sum_{\nu=1}^{n} a_\nu$ satisfy
the following two relations

$$s_1 \ge s_3 \ge s_5 \ge \cdots \ge s_{2m+1} \ge \cdots,$$
$$s_2 \le s_4 \le s_6 \le \cdots \le s_{2m} \le \cdots.$$

On the other hand, $s_{2n} \le s_{2n+1} \le s_1$ and
$s_{2n+1} \ge s_{2n} \ge s_2$. The odd partial sums $s_1, s_3, \ldots$
therefore form a monotonic decreasing sequence, which in no case falls below
the value $s_2$; hence this sequence possesses a limit $L$ (p. 73). The even
partial sums $s_2, s_4, \ldots$ likewise form a monotonic increasing sequence
whose terms in no case exceed the fixed number $s_1$, and therefore this
sequence must have a limiting value $L'$. Since the numbers $s_{2n}$ and
$s_{2n+1}$ differ from one another only by the number $b_{2n+1}$, which
approaches 0 as $n$ increases, the limiting values $L$ and $L'$ are equal to
one another. That is, the even and the odd partial sums approach the same
limit, which we now denote by $S$ (cf. Fig. 7.1). This, however, implies that
our series is convergent, as was asserted; its sum is $S$.

Figure 7.1 Convergence of an alternating series.
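The two monotonic sequences of partial sums can be watched closing in on the sum. The sketch below is illustrative only: the function name `partial_sums` is an assumption, and the choice $b_n = 1/n$ (the series for log 2) is made simply because its sum is known.

```python
import math


def partial_sums(b, n):
    """Partial sums s_1, ..., s_n of b(1) - b(2) + b(3) - + ... ."""
    sums, total = [], 0.0
    for k in range(1, n + 1):
        total += (-1) ** (k - 1) * b(k)
        sums.append(total)
    return sums


if __name__ == "__main__":
    s = partial_sums(lambda k: 1.0 / k, 10)
    print("odd :", s[0::2])   # s_1, s_3, ... decrease
    print("even:", s[1::2])   # s_2, s_4, ... increase
    print("log 2 =", math.log(2.0))
```

Every odd partial sum lies above log 2 and every even one below it, so two consecutive partial sums always enclose the sum of the series.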


*Abel’s Test 

A test for conditional convergence that includes the Leibnitz test as a 
special case is Abel’s convergence test. Let $a_1 + a_2 + \cdots$ be an
infinite series whose partial sums $s_n = a_1 + \cdots + a_n$ are bounded
independently of $n$. Let $p_1, p_2, \ldots$ be a sequence of positive numbers
decreasing monotonically to the value zero. Then the infinite series

$$p_1 a_1 + p_2 a_2 + \cdots \tag{2}$$

converges. (For the special series
$a_1 + a_2 + \cdots = +1 - 1 + 1 - 1 + - \cdots$ we find that
$p_1 - p_2 + p_3 - + \cdots$ converges, which is Leibnitz’ test.) The proof
follows if we apply Cauchy’s test using “summation by parts” to estimate

$$|p_{n+1} a_{n+1} + p_{n+2} a_{n+2} + \cdots + p_m a_m|$$
$$= |p_{n+1}(s_{n+1} - s_n) + p_{n+2}(s_{n+2} - s_{n+1}) + \cdots + p_m(s_m - s_{m-1})|$$
$$= |-p_{n+1} s_n + p_m s_m + (p_{n+1} - p_{n+2}) s_{n+1} + (p_{n+2} - p_{n+3}) s_{n+2} + \cdots + (p_{m-1} - p_m) s_{m-1}|$$
$$\le p_{n+1} M + p_m M + (p_{n+1} - p_{n+2} + p_{n+2} - p_{n+3} + - \cdots + p_{m-1} - p_m) M$$
$$= 2 p_{n+1} M,$$

where $M$ is a bound for the $|s_i|$; since $p_{n+1} \to 0$ the convergence of
the series (2) follows by Cauchy’s test.


*In conclusion, we make another general remark about the fundamental
difference between absolute convergence and conditional convergence. We
consider a convergent series $\sum_{\nu=1}^{\infty} a_\nu$. We denote the
positive terms of the series by $p_1, p_2, p_3, \ldots$, and the negative
terms by $-q_1, -q_2, -q_3, \ldots$. If we form the $n$th partial sum
$s_n = \sum_{\nu=1}^{n} a_\nu$ of the given series, a certain number, say
$n'$, of positive terms and a certain number, say $n''$, of negative terms
must appear, where $n' + n'' = n$. Furthermore, if the number of positive
terms as well as the number of negative terms in the series is infinite, then
the two numbers $n'$ and $n''$ will increase beyond all bounds as $n$ does. We
see immediately that the partial sum $s_n$ is simply equal to the partial sum
$\sum_{\nu=1}^{n'} p_\nu$ of the positive terms of the series plus the partial
sum $-\sum_{\nu=1}^{n''} q_\nu$ of the negative terms. If the given series
converges absolutely, then the series of positive terms
$\sum_{\nu=1}^{\infty} p_\nu$ and the series of absolute values of the
negative terms $\sum_{\nu=1}^{\infty} q_\nu$ certainly both converge. For as
$m$ increases, the partial sums $\sum_{\nu=1}^{m} p_\nu$ and
$\sum_{\nu=1}^{m} q_\nu$ are monotonic nondecreasing sequences with the upper
bound $\sum_{\nu=1}^{\infty} |a_\nu|$.

The sum of an absolutely convergent series is then simply equal to the sum of
the series consisting of the positive terms only, plus the sum of the series
consisting of the negative terms only, or, in other words, is equal to the
difference of the two series with positive terms.

For $\sum_{\nu=1}^{n} a_\nu = \sum_{\nu=1}^{n'} p_\nu - \sum_{\nu=1}^{n''} q_\nu$;
as $n$ increases $n'$ and $n''$ also increase beyond all bounds, and the limit
of the left-hand side must therefore be equal to the difference of the two
sums on the right. If the series contains only a finite number of terms of one
particular sign, the facts are correspondingly simplified. If, on the other
hand, the series does not converge absolutely, but does converge
conditionally, then the series $\sum_{\nu=1}^{\infty} p_\nu$ and
$\sum_{\nu=1}^{\infty} q_\nu$ must both be divergent. For if both were
convergent the series would converge absolutely, contrary to our hypothesis.
If only one diverged, say $\sum_{\nu=1}^{\infty} p_\nu$, and the other
converged, then separation into positive and negative parts,
$s_n = \sum_{\nu=1}^{n'} p_\nu - \sum_{\nu=1}^{n''} q_\nu$, shows that the
series could not converge; for as $n$ increases $n'$ and
$\sum_{\nu=1}^{n'} p_\nu$ would increase beyond all bounds, whereas the term
$\sum_{\nu=1}^{n''} q_\nu$ would approach a definite limit, so that the
partial sum $s_n$ would increase beyond all bounds.


We see, therefore, that a conditionally convergent series cannot be 
thought of as the difference of two convergent series, the one consisting 
of its positive terms and the other consisting of the absolute values of its 
negative terms. 


Closely connected with this fact is another difference between abso- 
lutely and conditionally convergent series which we shall now briefly 
mention. 


*c. Rearrangement of Terms


It is a property of finite sums that we can change the order of the terms or,
as we say, rearrange the terms at will without changing the value of the sum.
The question arises: what is the exact meaning of a change of the order of
terms in an infinite series, and does such a rearrangement leave the value of
the sum unchanged? Although in finite sums there is no difficulty, for
example, in adding the terms in reverse order, in infinite series such a
possibility does not exist; there is no last term with which to begin. Now a
change of order in an infinite series can only mean this: we say that a series
$a_1 + a_2 + a_3 + \cdots$ is transformed by rearrangement into a series
$b_1 + b_2 + b_3 + \cdots$, provided that every term $a_n$ of the first series
occurs exactly once in the second and conversely. For example, the amount by
which $a_n$ is displaced may increase beyond all bounds as $n$ does; the only
point is that $a_n$ must appear somewhere in the new series. If some of the
terms are moved to later positions in the series, other terms must, of course,
be moved to earlier positions. For example, the series

$$1 + q + q^3 + q^2 + q^7 + q^6 + q^5 + q^4 + q^{15} + q^{14} + \cdots$$

is a rearrangement¹ of the geometric series $1 + q + q^2 + \cdots$.
With regard to change of order there is a fundamental distinction between
absolutely convergent series and conditionally convergent series.


In absolutely convergent series rearrangement of the terms does not affect 
the convergence, and the value of the sum of the series is unchanged, exactly 
as in finite sums. 

In conditionally convergent series, on the other hand, the value of the sum 
of the series can be changed at will by suitable rearrangement of the series, 
and the series can even be made to diverge if desired. 


The first of these facts, referring to absolutely convergent series, is easily
established. Let us assume initially that our series has positive terms only,
and consider the $n$th partial sum $s_n = \sum_{\nu=1}^{n} a_\nu$. All the terms
of this partial sum occur in the $m$th partial sum
$t_m = \sum_{\nu=1}^{m} b_\nu$ of the rearranged series, provided only that $m$ is
chosen large enough. Hence $t_m \ge s_n$. On the other hand, we can determine
an index $n'$ so large that the partial sum $s_{n'} = \sum_{\nu=1}^{n'} a_\nu$ of
the first series contains all the terms $b_1, b_2, \ldots, b_m$. It then follows
that $t_m \le s_{n'} \le A$, where $A$ is the sum of the first series. Thus for
all sufficiently large values of $m$ we have $s_n \le t_m \le A$, and since $s_n$
can be made to differ from $A$ by an arbitrarily small amount, it follows that
the rearranged series also converges and, in fact, to the same limit $A$ as the
original series.

If the absolutely convergent series has both positive and negative terms, 
we may, in fact, regard it as the difference of two series each of which has 
positive terms only. Since in the rearrangement of the original series each 
of these two series merely undergoes rearrangement and therefore converges 
to the same value as before, the same is true of the original series when 
rearranged. For by the case just considered the new series is absolutely 
convergent and is therefore the difference of the two rearranged series of 
positive terms. 

¹ For each $n \ge 0$ the terms $q^k$ with $2^n \le k < 2^{n+1}$ are written in reverse order.

To the beginner the fact just proved may seem a triviality. That it really
does require proof, and that in this proof the absolute convergence is essential,
can be shown by an example of the opposite behavior of conditionally
convergent series. We take the familiar series for $\log 2$, below which we
write the result of multiplication by the factor $\tfrac12$,

$$
\begin{array}{rcl}
1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \tfrac17 - \tfrac18 + - \cdots &=& \log 2, \\
\phantom{1} + \tfrac12 \phantom{{}+\tfrac13} - \tfrac14 \phantom{{}+\tfrac15} + \tfrac16 \phantom{{}+\tfrac17} - \tfrac18 + - \cdots &=& \tfrac12 \log 2,
\end{array}
$$

and add, combining the terms placed in vertical columns.¹ We thus obtain

$$1 + \tfrac13 - \tfrac12 + \tfrac15 + \tfrac17 - \tfrac14 + \tfrac19 + \tfrac{1}{11} - \tfrac16 + \cdots = \tfrac32 \log 2.$$

This last series can obviously be obtained by rearranging the original series,
and yet the value of the sum of the series has been multiplied by the factor
$\tfrac32$. It is easy to imagine the effect that the discovery of this apparent
paradox must have had on the mathematicians of the eighteenth century, who were
accustomed to operate with infinite series without regard to their convergence.
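
The change of sum can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the rearranged series is summed in blocks of two positive terms followed by one negative term, $\frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k}$, which is the pattern of the series obtained above.

```python
import math

def original_partial_sum(terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with `terms` terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, terms + 1))

def rearranged_partial_sum(blocks):
    """Partial sum of 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ...

    Each block k contributes 1/(4k-3) + 1/(4k-1) - 1/(2k).
    """
    total = 0.0
    for k in range(1, blocks + 1):
        total += 1.0 / (4 * k - 3) + 1.0 / (4 * k - 1) - 1.0 / (2 * k)
    return total

if __name__ == "__main__":
    print(original_partial_sum(100_000), math.log(2))
    print(rearranged_partial_sum(100_000), 1.5 * math.log(2))
```

Both partial sums use the same terms of the original series, only in a different order, yet they settle near different values.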

*We shall give the proof of the above theorem concerning the change
in the sum of a conditionally convergent series $\sum a_\nu$, which arises from
change of order of the terms, although we shall have no occasion to make use of
the result. Let $p_1, p_2, \ldots$ be the positive terms and
$-q_1, -q_2, \ldots$ the negative terms of the series. Since the absolute value
$|a_n|$ tends to 0 as $n$ increases, the numbers $p_n$ and $q_n$ must also tend
to 0 as $n$ increases. As we have seen, moreover, the sum
$\sum_{\nu=1}^{\infty} p_\nu$ must diverge, and the same is true of
$\sum_{\nu=1}^{\infty} q_\nu$.

Now we can easily find a rearrangement of the original series which has an
arbitrary number $a$ as sum. Suppose, to be specific, that $a$ is positive. We
then add together the first $n_1$ positive terms, just enough to bring about
that the sum $\sum_{\nu=1}^{n_1} p_\nu$ is greater than $a$. Since the sum
$\sum_{\nu=1}^{n_1} p_\nu$ increases with $n_1$ beyond all bounds, it is always
possible by using enough terms to make the partial sum greater than $a$. The
sum will then differ from the exact value $a$ by $p_{n_1}$ at most. We now add
just enough negative terms $-\sum_{\nu=1}^{m_1} q_\nu$ to ensure that the sum
$\sum_{\nu=1}^{n_1} p_\nu - \sum_{\nu=1}^{m_1} q_\nu$ is less than $a$; this is also
possible, as follows from the divergence of the series
$\sum_{\nu=1}^{\infty} q_\nu$. The difference between this sum and $a$ is now
$q_{m_1}$ at most. We now add just enough other positive terms
$\sum_{\nu=n_1+1}^{n_2} p_\nu$ to make the partial sum again greater than $a$, as
is again possible, since the series of positive terms diverges. The difference
between the partial sum and $a$ is now $p_{n_2}$ at most. We again add just
enough negative terms $-\sum_{\nu=m_1+1}^{m_2} q_\nu$, beginning next after the
last one previously used, to make the sum once more less than $a$, and continue
in the same way. The values of the sums thus obtained will oscillate about the
number $a$, and when the process is carried far enough the oscillation will
only take place between arbitrarily narrow bounds; for, since the terms
$p_\nu$ and $q_\nu$ themselves tend to 0 as $\nu$ increases, the length of the
interval in which the oscillation takes place will also tend to 0.


The theorem is thus proved.


¹ For the addition of series see Section 7.1d.

In the same way we can rearrange the series in such a way as to make it
diverge: we have only to choose such large numbers of the positive terms as
compared with the negative that compensation no longer takes place.
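
The construction used in the proof can be tried out on the series $1 - \tfrac12 + \tfrac13 - \cdots$. The following is a minimal sketch under stated assumptions: the function name `rearrange_to_target` is hypothetical, the positive terms are taken to be $1, \tfrac13, \tfrac15, \ldots$ and the negative terms $-\tfrac12, -\tfrac14, \ldots$, and the rule "add positive terms while the sum does not exceed the target, otherwise add negative terms" is used as a one-term-at-a-time version of the procedure described above.

```python
def rearrange_to_target(target, steps):
    """Partial sum, after `steps` terms, of a rearrangement of
    1 - 1/2 + 1/3 - 1/4 + ... that is steered toward `target`.

    Positive terms 1/1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ...
    are each used in their original order, exactly once.
    """
    next_odd, next_even = 1, 2
    total = 0.0
    for _ in range(steps):
        if total <= target:
            # Not yet above the target: take the next unused positive term.
            total += 1.0 / next_odd
            next_odd += 2
        else:
            # Above the target: take the next unused negative term.
            total -= 1.0 / next_even
            next_even += 2
    return total

if __name__ == "__main__":
    for a in (0.5, 2.0):
        print(a, rearrange_to_target(a, 100_000))
```

After each crossing the partial sum differs from the target by at most the size of the term last used, so the deviation shrinks as more terms are taken.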


d. Operations with Infinite Series 


It is clear that two convergent infinite series $a_1 + a_2 + \cdots = S$
and $b_1 + b_2 + \cdots = T$ can be added term by term, that is, that the
series formed from the terms $c_n = a_n + b_n$ converges and has the
value $S + T$ for its sum.¹ For

$$\sum_{\nu=1}^{n} c_\nu = \sum_{\nu=1}^{n} a_\nu + \sum_{\nu=1}^{n} b_\nu \to S + T.$$

It is also clear that if we multiply each term of a convergent infinite 
series by the same factor, the series remains convergent, its sum being 
multiplied by the same factor. 

For these operations it is immaterial whether the convergence is 
absolute or conditional. On the other hand, further study shows that 
multiplication of two infinite series by the method used in multiplying 
finite sums does not necessarily lead to a convergent series for the value 
of the product, unless at least one of the two series is absolutely con- 
vergent (cf. Appendix, p. 555). 


7.2 Tests for Absolute Convergence and Divergence 


In Section 7.1b we have already encountered Leibnitz’ useful test for 
the conditional convergence of series. In the following pages we shall 
only consider criteria referring to absolute convergence. 


a. The Comparison Test. Majorants 


All such considerations of convergence depend on the comparison 
of the series in question with a second series; this second series is 
chosen in such a way that its convergence can readily be tested. The 
general comparison test may be stated as follows: 


If the numbers $b_1, b_2, \ldots$ are all positive and the series
$\sum_{n=1}^{\infty} b_n$ converges, and if

$$|a_n| \le b_n$$

for all values of $n$, then the series $\sum_{n=1}^{\infty} a_n$ is absolutely
convergent.


¹ This theorem is really nothing more than another statement of the fact (cf. Chapter
1, p. 72) that the limit of the sum of two terms is the sum of their limits.

By Cauchy’s test the proof becomes almost trivial. For if $m > n$,
we have

$$|a_{n+1} + \cdots + a_m| \le |a_{n+1}| + \cdots + |a_m| \le b_{n+1} + \cdots + b_m.$$

Since the series $\sum_{n=1}^{\infty} b_n$ converges, the right-hand side is
arbitrarily small, provided that $n$ and $m$ are sufficiently large. It follows
that for such values of $n$ and $m$ the left-hand side is also arbitrarily
small, so that by Cauchy’s test the given series converges. The convergence is
absolute, since our argument applies equally well to the convergence of the
series of absolute values $|a_n|$.

The analogous proof for the following fact can be left to the reader.

If

$$|a_n| \ge b_n > 0,$$

and the series $\sum_{n=1}^{\infty} b_n$ diverges, then the series
$\sum_{n=1}^{\infty} a_n$ is certainly not absolutely convergent.

Sometimes the above series with the positive terms $b_n$ are called
majorant and minorant series, respectively, for the one with terms $a_n$.
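
As an illustration of a majorant, the following minimal sketch compares the partial sums of $\sum |\sin n| / n^2$ with those of $\sum 1/n^2$. The example series and the helper name `partial_sums` are assumptions chosen for illustration and do not come from the text; the bound 2 for the majorant is the value $1 + \int_1^\infty dx/x^2$.

```python
import math

def partial_sums(term, count):
    """List of partial sums term(1), term(1)+term(2), ... up to `count` terms."""
    total, sums = 0.0, []
    for n in range(1, count + 1):
        total += term(n)
        sums.append(total)
    return sums

# |a_n| = |sin n| / n^2 is dominated term by term by b_n = 1 / n^2.
abs_a = partial_sums(lambda n: abs(math.sin(n)) / n ** 2, 1000)
major = partial_sums(lambda n: 1.0 / n ** 2, 1000)

if __name__ == "__main__":
    print(abs_a[-1], major[-1])
```

The partial sums of the absolute values are nondecreasing and never exceed those of the majorant, which is exactly what the comparison test exploits.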


b. Convergence Tested by Comparison with the Geometric Series 


In applications of the test the comparison series most frequently 
used as a majorant is the geometric series. We at once obtain the 
following theorem. 

THEOREM. The series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent if from a
certain term onward a relation of the form

(3)    $|a_n| < cq^n$

holds, where $c$ is a positive number independent of $n$ and $q$ is any fixed
positive number less than 1.


Ratio and Root Tests. This test is usually expressed in one of the
following weaker forms: the series $\sum_{n=1}^{\infty} a_n$ converges absolutely,
if from a certain term onward a relation of the form

(4a)    $\left| \dfrac{a_{n+1}}{a_n} \right| < q$

holds, where $q$ is again a positive number less than 1 and independent
of $n$, or: if from a certain term onward a relation of the form

(4b)    $\sqrt[n]{|a_n|} < q$

holds, where $q$ is a positive number less than 1. In particular, the
conditions of these tests are satisfied if a relation of the form

(5a)    $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = k < 1$

or

(5b)    $\lim_{n \to \infty} \sqrt[n]{|a_n|} = k < 1$

is true.


These statements are easily established in the following way. 

Let us suppose that the criterion (4a), the ratio test, is satisfied from
the suffix $n_0$ onward, that is, when $n > n_0$. For brevity we put
$a_{n_0+m+1} = b_m$ and find that

$$|b_1| < q\,|b_0|, \quad |b_2| < q\,|b_1| < q^2\,|b_0|, \quad |b_3| < q\,|b_2| < q^3\,|b_0|$$

and so on; hence

$$|b_m| < q^m\,|b_0|,$$

and then for $n > n_0$ and $c = q^{-n_0-1}\,|b_0|$

$$|a_n| = |b_{n-n_0-1}| < q^{n-n_0-1}\,|b_0| = cq^n,$$

which establishes our statement. For the criterion (4b), the root test,
we at once have $|a_n| < q^n$, and our statement follows immediately.
Finally, in order to prove the criteria (5), we consider an arbitrary
number $q$ such that $k < q < 1$. Then from a certain $n_0$ onward, that is,
when $n > n_0$, Eqs. (5a, b) imply that $\left| \dfrac{a_{n+1}}{a_n} \right| < q$ and
$\sqrt[n]{|a_n|} < q$ respectively, since from a certain term onwards the values
of $\left| \dfrac{a_{n+1}}{a_n} \right|$ or of $\sqrt[n]{|a_n|}$ differ from $k$ by
less than $(q - k)$. The statement is then established on the basis of the
results already proved.

We stress the point that the four tests (4a, b), (5a, b), derived from
the original criterion $|a_n| < cq^n$, are not equivalent to one another or to
the original, that is, that they cannot be derived from one another in
both directions. We shall soon see from examples that if a series satisfies
one of the conditions, it need not satisfy all the others.

For completeness it may be pointed out that a series certainly
diverges if from a certain term onward

$$|a_n| > c$$

for some positive number $c$, or if from a certain term onward

$$\sqrt[n]{|a_n|} > 1,$$

or if

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = k, \quad \text{or} \quad \lim_{n \to \infty} \sqrt[n]{|a_n|} = k,$$

where $k$ is a number greater than 1. For, as we immediately recognize,
in such a series the terms cannot tend to zero as $n$ increases; the series
must therefore diverge. (In these circumstances the series cannot even
be conditionally convergent.)

Our tests furnish sufficient conditions for the absolute convergence of 
a series; that is, when they are satisfied we can conclude that the series 
converges absolutely. They are definitely not necessary conditions, 
however; that is, absolutely convergent series can be formed which do 
not satisfy the conditions. 

Thus the knowledge that

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = 1 \quad \text{or} \quad \lim_{n \to \infty} \sqrt[n]{|a_n|} = 1$$

does not imply anything about the convergence of the series. Such a
series may converge or diverge. For example, the series
$\sum_{n=1}^{\infty} \dfrac{1}{n}$, for which $\lim_{n \to \infty} \sqrt[n]{|a_n|} = 1$ and
$\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = 1$, is divergent, as we saw on
p. 513. On the other hand, as we shall soon see, the series
$\sum_{n=1}^{\infty} \dfrac{1}{n^2}$, which satisfies the same relations, is convergent.


As an example of the application of our tests we first consider the series

$$q + 2q^2 + 3q^3 + \cdots + nq^n + \cdots.$$

For this series

$$\lim_{n \to \infty} \sqrt[n]{|a_n|} = |q| \cdot \lim_{n \to \infty} \sqrt[n]{n} = |q|,$$

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = |q| \cdot \lim_{n \to \infty} \frac{n+1}{n} = |q|.$$

That the series converges if $|q| < 1$ follows from the ratio test and from the
root test also, even in the weaker form (5).

If, on the other hand, we consider the series

$$1 + 2q + q^2 + 2q^3 + \cdots + q^{2n} + 2q^{2n+1} + \cdots,$$

we can no longer prove convergence by the ratio test when $\tfrac12 \le |q| < 1$;
for then $\left| \dfrac{2q^{2n+1}}{q^{2n}} \right| = 2|q| \ge 1$. But the root test
immediately gives us $\lim_{n \to \infty} \sqrt[n]{|a_n|} = |q|$, and shows that the
series converges provided that $|q| < 1$, which, of course, we could also have
observed directly.
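
The different behavior of the two tests on this series can be seen numerically. The following is a minimal sketch with hypothetical function names; the terms are indexed from $k = 0$, and the closed form $(1 + 2q)/(1 - q^2)$ used in the last line is obtained by summing the even and odd terms as two geometric series, which is an addition here and not a formula from the text.

```python
def term(k, q):
    """k-th term (k = 0, 1, 2, ...) of 1 + 2q + q^2 + 2q^3 + ..."""
    return q ** k if k % 2 == 0 else 2 * q ** k

def ratio(k, q):
    """Quotient |a_{k+1} / a_k| used in the ratio test."""
    return abs(term(k + 1, q) / term(k, q))

def kth_root(k, q):
    """k-th root of |a_k| used in the root test (k >= 1)."""
    return abs(term(k, q)) ** (1.0 / k)

if __name__ == "__main__":
    q = 0.6
    print([round(ratio(k, q), 3) for k in range(10, 14)])     # alternates 1.2, 0.3
    print([round(kth_root(k, q), 3) for k in (50, 51, 200, 201)])
    print(sum(term(k, q) for k in range(400)), (1 + 2 * q) / (1 - q * q))
```

For $q = 0.6$ the quotients keep returning to $2q = 1.2$, so criterion (4a) fails, while the roots approach $0.6$.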


c. Comparison with an Integral¹


We now proceed to discuss quite a different method of studying
convergence. We shall explain it for the typical, particularly simple
and important case of the series

$$\sum_{n=1}^{\infty} \frac{1}{n^\alpha} = 1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \cdots,$$

where the general term $a_n$ is $1/n^\alpha$, $\alpha$ being a positive number. In
order to investigate the convergence or divergence of this series, we consider
the graph of the function $y = 1/x^\alpha$ and mark off on the $x$-axis the
integral abscissae $x = 1, x = 2, \ldots$. We first construct the rectangle of
height $1/n^\alpha$ over the interval $n - 1 \le x \le n$ of the $x$-axis ($n > 1$),
and compare it with the area of the region bounded by the same interval
of the $x$-axis, the ordinates at the ends, and the curve $y = 1/x^\alpha$ (this
region is shown shaded in Fig. 7.2). Secondly, we construct the
rectangle of height $1/n^\alpha$ lying above the interval $n \le x \le n + 1$, and
similarly compare it with the area of the region lying above the same
interval and below the curve (this region is cross-hatched in Fig. 7.2).
In the first case the area under the curve is obviously greater than the
area of the rectangle; in the second case it is less than the area of the
rectangle. In other words,

$$\int_{n}^{n+1} \frac{dx}{x^\alpha} < \frac{1}{n^\alpha} < \int_{n-1}^{n} \frac{dx}{x^\alpha}.$$


Figure 7.2 Comparison of series with an integral.


¹ In this connection see also the Appendix to Chapter 5, p. 505.

Writing down these inequalities for $n = 1, 2, 3, \ldots, m$, respectively
$n = 2, 3, \ldots, m$, and summing, we obtain the following estimate for
the $m$th partial sum $s_m = \sum_{n=1}^{m} \dfrac{1}{n^\alpha}$:

(6)    $\displaystyle \int_{1}^{m+1} \frac{dx}{x^\alpha} < s_m < 1 + \int_{1}^{m} \frac{dx}{x^\alpha}.$

Now as $m$ increases the integral $\int_{1}^{m} \dfrac{dx}{x^\alpha}$ tends to a finite
limit or increases without limit depending on whether $\alpha > 1$ or
$\alpha \le 1$. Consequently, the monotonic sequence of numbers $s_m$ is bounded or
increases beyond all bounds depending on whether $\alpha > 1$ or $\alpha \le 1$,
and we thus have the following theorem.


THEOREM. The series of reciprocal powers

$$\sum_{n=1}^{\infty} \frac{1}{n^\alpha} = \frac{1}{1^\alpha} + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \cdots$$

is convergent if and only if $\alpha > 1$.
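
The estimate (6) is easy to check for particular values. The following is a minimal sketch with hypothetical function names; the integral of $x^{-\alpha}$ from 1 to an upper limit is evaluated from its elementary antiderivative, $\log$ for $\alpha = 1$ and $(u^{1-\alpha} - 1)/(1 - \alpha)$ otherwise.

```python
import math

def integral_of_power(upper, alpha):
    """Integral of x**(-alpha) from 1 to `upper`, from the antiderivative."""
    if alpha == 1:
        return math.log(upper)
    return (upper ** (1 - alpha) - 1) / (1 - alpha)

def partial_sum(m, alpha):
    """s_m = 1 + 1/2**alpha + ... + 1/m**alpha."""
    return sum(1.0 / n ** alpha for n in range(1, m + 1))

def bounds(m, alpha):
    """Lower bound, partial sum and upper bound appearing in estimate (6)."""
    lower = integral_of_power(m + 1, alpha)
    upper = 1 + integral_of_power(m, alpha)
    return lower, partial_sum(m, alpha), upper

if __name__ == "__main__":
    for alpha in (0.5, 1, 2):
        print(alpha, [bounds(m, alpha) for m in (10, 1000)])
```

For $\alpha = 2$ the upper bound stays below 2 for every $m$, whereas for $\alpha = 1$ and $\alpha = \tfrac12$ the lower bound grows without limit.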


For $\alpha = 1$ the divergence of the harmonic series, which we previously
proved in a different way, is an immediate consequence; likewise the
series


converge while the one $1 + \dfrac{1}{\sqrt{2}} + \dfrac{1}{\sqrt{3}} + \cdots$ diverges.


The convergent series $\sum_{\nu=1}^{\infty} \dfrac{1}{\nu^\alpha}$ for $\alpha > 1$
frequently serve as comparison series in investigations of convergence. For
example, we see at once that for $\alpha > 1$ the series
$\sum_{\nu=1}^{\infty} \dfrac{c_\nu}{\nu^\alpha}$ converges absolutely if the absolute
values $|c_\nu|$ of the coefficients remain less than a fixed bound independent
of $\nu$.


Euler’s Constant. From the estimate (6) for $\alpha = 1$ it follows at once
that the sequence of numbers

$$C_n = 1 + \frac12 + \frac13 + \cdots + \frac1n - \log n = s_n - \log n > \log (n + 1) - \log n > 0$$

is bounded below. Since from the inequality

$$\frac{1}{n+1} < \int_{n}^{n+1} \frac{dx}{x} = \log (n + 1) - \log n = \frac{1}{n+1} + C_n - C_{n+1}$$

we see that the sequence is monotonic decreasing, it must approach a limit

$$\lim_{n \to \infty} C_n = \lim_{n \to \infty} \left( 1 + \frac12 + \frac13 + \cdots + \frac1n - \log n \right) = C.$$

The number $C$, whose value is 0.5772 ..., is called Euler’s constant. In
contrast to the other important special numbers of analysis, such as $\pi$
and $e$, no other expression with a simple law of formation has been
found for Euler’s constant. Whether $C$ is rational or irrational is not
known to this day.
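
The monotonic approach of $C_n$ to its limit can be observed directly. The following is a minimal sketch; the function name `euler_sequence` is hypothetical, and the computation simply evaluates the definition of $C_n$ in floating-point arithmetic.

```python
import math

def euler_sequence(n):
    """C_n = 1 + 1/2 + ... + 1/n - log n."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 100_000):
        print(n, euler_sequence(n))
```

The printed values decrease and remain positive, in agreement with the two inequalities above, and for large $n$ they begin with the digits 0.5772.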


7.3 Sequences of Functions 


As emphasized frequently before, the limit process serves not only 
to represent known numbers approximately by other, simpler ones, 
but it also serves to extend the set of known numbers into a wider one. 
It is of decisive importance in analysis to study limits not only for 
sequences—or infinite series—of constant numbers, but similarly for 
sequences of functions, or series whose terms are functions of a variable 
x, as, for example the Taylor series or power series in general. Not only 
the approximation of given functions by simpler ones requires such 
limiting processes but also the definition and analytic description of 
new functions must frequently be based on the concept of limit of 
sequences of functions: $f(x) = \lim f_n(x)$ for $n \to \infty$. Equivalently,
we may consider $f(x)$ as the sum and the $f_n(x)$ as the partial sums of
an infinite series $f(x) = \sum_{\nu=1}^{\infty} g_\nu(x)$ of functions $g_\nu(x)$
where $g_\nu(x) = f_\nu(x) - f_{\nu-1}(x)$ for $\nu > 1$ and $g_1(x) = f_1(x)$.


We shall now discuss precise definitions and geometrical interpretations.


a. Limiting Processes with Functions and Curves


Definition. The sequence $f_1(x), f_2(x), \ldots$ converges in the interval
$a \le x \le b$ to the limit function $f(x)$, if at each point $x$ of the interval
the values $f_n(x)$ converge in the usual sense to the value $f(x)$. In this case
we write $\lim_{n \to \infty} f_n(x) = f(x)$. According to Cauchy’s test (cf. p. 75)
we can express the convergence of the sequence without referring to
the limit function $f(x)$: The sequence of functions converges to a limit
function if and only if at each point $x$ in our interval and for every
positive number $\epsilon$, the quantity $|f_n(x) - f_m(x)|$ is less than $\epsilon$,
provided that $n$ and $m$ are chosen large enough, that is, larger than a
certain number. This number $N = N(\epsilon, x)$ usually depends on $\epsilon$ and $x$
and increases beyond all bounds as $\epsilon$ tends to zero.

We have frequently met with cases of limits of sequences of functions.
We mention only the definition of the power $x^\alpha$ for irrational values of
$\alpha$ by the equation

$$x^\alpha = \lim_{n \to \infty} x^{r_n},$$

where $r_1, r_2, \ldots, r_n, \ldots$ is a sequence of rational numbers tending to
$\alpha$; or the equation

$$e^x = \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n,$$

where the approximating functions $f_n(x)$ on the right are polynomials
of degree $n$.

The graphical representation of functions by means of curves suggests
that we can also speak of limits of sequences of curves, saying, for
example, that the graphs of the preceding limit functions $x^\alpha$ and $e^x$
are to be regarded as the limit curves of the graphs of the functions
$x^{r_n}$ and $\left( 1 + \dfrac{x}{n} \right)^n$ respectively.


There is, however, a fine distinction between passages to the limit 
with functions and with curves, not clearly observed until the middle of 
the nineteenth century. We shall illustrate this point by an example and 
then discuss it systematically in the next section. 


We consider the functions

$$f_n(x) = x^n, \quad n = 1, 2, 3, \ldots$$

in the interval $0 \le x \le 1$. All these functions are continuous, and the
limit function $\lim_{n \to \infty} f_n(x) = f(x)$ exists. But this limit function is
not continuous. On the contrary, since for all values of $n$ the value of the
function $f_n(1) = 1$, the limit

$$f(1) = 1;$$

while, on the other hand, for $0 \le x < 1$, the limit
$f(x) = \lim_{n \to \infty} f_n(x) = 0$, as we saw in Chapter 1, p. 65. The function
$f(x)$ is therefore a discontinuous function which at $x = 1$ has the value 1
while for all other values of $x$ in the interval has the value 0.


Figure 7.3 Limit curve and limit function. 


This discontinuity is geometrically illustrated by the graphs $C_n$ of the
functions $y = f_n(x)$. These (cf. Fig. 1.44, p. 66) are continuous curves,
all of which pass through the origin and the point $x = 1$, $y = 1$, and
which draw in closer and closer to the $x$-axis as $n$ increases. The curves
do possess a limit curve $C$ which is not discontinuous at all, but consists
(cf. Fig. 7.3) of the portion of the $x$-axis between $x = 0$ and $x = 1$, and
the portion of the line $x = 1$ between $y = 0$ and $y = 1$. The curves
therefore converge to a continuous limit curve with a vertical portion,
whereas the functions converge to a discontinuous limit function. We
thus recognize that this discontinuity of the limit function expresses
itself by the occurrence in the limit curve of a portion perpendicular to
the $x$-axis. This limit curve is not the graph of the limit function; for
corresponding to the value of $x$ at which the vertical portion occurs the
curve gives an infinite number of values of $y$ and the function only one.
Hence the limit of the graphs of the functions $f_n(x)$ is not the same as
the graph of the limit $f(x)$ of these functions.

Corresponding statements, of course, hold for infinite series as well.


7.4 Uniform and Nonuniform Convergence


a. General Remarks and Definitions 


The distinction between the concept of the convergence of functions
and that of the convergence of curves is a phenomenon which the
student should clearly grasp. This involves the so-called nonuniform
convergence of sequences or infinite series of functions which we shall
discuss in some detail.


Figure 7.4 To illustrate uniform convergence.


That a function $f(x)$ is the limit of a sequence $f_1(x), f_2(x), \ldots$ in an
interval $a \le x \le b$ means by definition merely that the usual limit
relationship $f(x) = \lim_{n \to \infty} f_n(x)$ holds at each point $x$ of the interval.
Such convergence is a local property of the sequence at the point $x$.
It is, however, natural to require somewhat more than the mere local
convergence of our approximations: that if we assign an arbitrary
measure of accuracy $\epsilon$, then from a certain index $N$ onward all the
functions $f_n(x)$ should lie between $f(x) - \epsilon$ and $f(x) + \epsilon$, for all
values of $x$, so that their graphs $y = f_n(x)$ lie entirely in the strip shown
in Fig. 7.4. If the accuracy of the approximation can be made at least equal to
a preassigned positive number $\epsilon$, everywhere in the interval at the same
time, that is, by everywhere choosing the same number $N(\epsilon)$ independent
of $x$, we say that the approximation is uniform.¹ If
$\lim_{n \to \infty} f_n(x) = f(x)$ uniformly for $a \le x \le b$, there exists for every
$\epsilon > 0$ a corresponding number $N = N(\epsilon)$ such that
$|f(x) - f_n(x)| < \epsilon$ for all $n > N$ and all $x$ in the interval. Many people
were quite surprised when in the middle of the nineteenth century it was
noticed by Seidel and others that convergence of functions need not at all be
uniform as had been naively assumed.


Examples of Nonuniform Convergence. The concept of uniform convergence
is illuminated by examples of nonuniform convergence.


(a) The first example occurs for the sequence of functions just considered,
$f_n(x) = x^n$; in the interval $0 \le x \le 1$ this sequence converges to the limit
function $f(x) = 0$ for $0 \le x < 1$, $f(1) = 1$. Convergence occurs at every
point in the interval; that is, if $\epsilon$ is any positive number, and if we select
any definite fixed value $x = \xi$, the inequality $|\xi^n - f(\xi)| < \epsilon$ certainly
holds if $n$ is sufficiently large. Yet this approximation is not uniform. For,
if we choose $\epsilon = \tfrac12$, then no matter how large the number $n$ is chosen, we
can find a point $x = \eta \ne 1$ at which $|\eta^n - f(\eta)| = \eta^n > \tfrac12$; this
is, in fact, true for all points $x = \eta$ where $1 > \eta > \sqrt[n]{\tfrac12}$. It is
therefore impossible to choose the number $n$ so large that the difference
between $f(x)$ and $f_n(x)$ is less than $\tfrac12$ throughout the whole interval.

This behavior becomes intelligible if we refer to the graphs of these
functions (Fig. 7.3). We see that no matter how large a value of $n$ we choose,
for values of $\xi$ only a little less than 1 the value of the function
$f_n(\xi)$ will be very near 1, and therefore cannot be a good approximation to
$f(\xi)$, which is 0.

Similar behavior is exhibited by the functions

$$f_n(x) = \frac{1}{1 + x^{2n}}$$

in the neighborhood of the points $x = 1$ and $x = -1$; this can easily be
established. Here $f(x) = 1$ for $|x| < 1$, $f(x) = \tfrac12$ for $|x| = 1$ and
$f(x) = 0$ for $|x| > 1$.
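
The argument in Example (a) can be made concrete by exhibiting, for each $n$, a point $\eta < 1$ with $\eta^n > \tfrac12$. The following is a minimal sketch; the function name `witness_point` is hypothetical, and the midpoint between $\sqrt[n]{1/2}$ and 1 is just one convenient choice of such a point.

```python
def witness_point(n):
    """A point eta with (1/2)**(1/n) < eta < 1, so that eta**n > 1/2."""
    threshold = 0.5 ** (1.0 / n)
    return (threshold + 1.0) / 2.0

if __name__ == "__main__":
    for n in (1, 10, 1000, 10 ** 6):
        eta = witness_point(n)
        print(n, eta, eta ** n)      # eta**n stays above 1/2 for every n
    print(0.9 ** 200)                # at the fixed point x = 0.9 the values tend to 0
```

At any fixed $x < 1$ the values $x^n$ become small, but the point at which the error exceeds $\tfrac12$ moves toward 1 as $n$ grows, so no single $n$ works for the whole interval.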


(b) In the above two examples the nonuniformity of the convergence is
connected with the fact that the limit function is discontinuous. Yet it is also
easy to construct a sequence of continuous functions which do converge to a
continuous limit function, but not uniformly. We restrict our attention to
the interval $0 \le x \le 1$ and make the following definitions for $n \ge 2$:

$$f_n(x) = x n^\alpha \quad \text{for } 0 \le x \le \frac1n,$$
$$f_n(x) = \left( \frac2n - x \right) n^\alpha \quad \text{for } \frac1n \le x \le \frac2n,$$
$$f_n(x) = 0 \quad \text{for } \frac2n \le x \le 1,$$

where to begin with we can choose any value for $\alpha$, but must then keep this
value of $\alpha$ fixed for all terms of the sequence. Graphically, our functions are
represented by a roof-shaped figure made of two line segments lying over
the interval $0 \le x \le 2/n$ of the $x$-axis, whereas from $x = 2/n$ onward the
graph is the $x$-axis itself (cf. Fig. 7.5).


¹ Compare with the analogous definition of uniform continuity, p. 41, where we can
choose the same number $\delta(\epsilon)$ independent of $x$.


Figure 7.5 To illustrate nonuniform convergence.


If $\alpha < 1$, the altitude of the highest point of the graph, which has in
general the value $n^{\alpha-1}$, will tend to zero as $n$ increases; the curves will
then tend toward the $x$-axis, and the functions $f_n(x)$ will converge uniformly
to the limit function $f(x) = 0$.

If $\alpha = 1$, the peak of the graph will have the height 1 for every value of $n$.
If $\alpha > 1$, the height of the peak will increase beyond all bounds as $n$
increases.

However, no matter how $\alpha$ is chosen, the sequence $f_1(x), f_2(x), \ldots$ always
tends to the limit function $f(x) = 0$. For, if $x$ is positive, we have $2/n < x$
for all sufficiently large values of $n$, so that $x$ is not under the roof-shaped
part of the graph and $f_n(x) = 0$; for $x = 0$ all the functional values $f_n(x)$
are equal to 0, so that in either case $\lim_{n \to \infty} f_n(x) = 0$.

The convergence is certainly nonuniform, however, if $\alpha \ge 1$; for it is
plainly impossible to choose $n$ so large that the expression
$|f(x) - f_n(x)| = f_n(x)$ is less than $\tfrac12$ everywhere in the interval.

(c) Exactly similar behavior is exhibited by the sequence of functions

$$f_n(x) = x n^\alpha e^{-nx},$$

where, in contrast with the preceding case, each function of the sequence is
represented by a single analytical expression. Here again the equation
$\lim_{n \to \infty} f_n(x) = 0$ holds for every positive value of $x$, since as $n$
increases the function $e^{-nx}$ tends to zero to a higher order than any power
of $1/n$ (cf. Section 3.7b, p. 250). For $x = 0$, we have always $f_n(x) = 0$, and
thus

$$f(x) = \lim_{n \to \infty} f_n(x) = 0$$

for every value of $x$ in the interval $0 \le x \le a$, where $a$ is an arbitrary
positive number. But here again the convergence to the limit function is not
uniform. For at the point $x = 1/n$ [where $f_n(x)$ has its maximum] we have

$$f_n\!\left( \frac1n \right) = \frac{n^{\alpha-1}}{e},$$

and we thus recognize that if $\alpha \ge 1$, the convergence is nonuniform, for
every curve $y = f_n(x)$, no matter how large $n$ is chosen, will contain points
(namely, the point $x = 1/n$, which varies with $n$) at which
$f_n(x) - f(x) = f_n(x) > 1/2e$ (cf. Fig. 7.6).


Figure 7.6 Nonuniform convergence of the sequence $f_n(x) = n^2 x e^{-nx}$.
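
The peak heights in Examples (b) and (c) can be tabulated for several values of $\alpha$. The following is a minimal sketch with hypothetical function names (`tent`, `smooth`, `peak_heights`); it evaluates both families at $x = 1/n$, where each attains its maximum, and at a fixed point to show the pointwise limit 0.

```python
import math

def tent(n, x, alpha):
    """Roof-shaped function f_n of Example (b) on 0 <= x <= 1 (n >= 2)."""
    if x <= 1.0 / n:
        return x * n ** alpha
    if x <= 2.0 / n:
        return (2.0 / n - x) * n ** alpha
    return 0.0

def smooth(n, x, alpha):
    """Function f_n(x) = x * n**alpha * exp(-n x) of Example (c)."""
    return x * n ** alpha * math.exp(-n * x)

def peak_heights(n, alpha):
    """Values of both functions at x = 1/n, where each has its maximum."""
    return tent(n, 1.0 / n, alpha), smooth(n, 1.0 / n, alpha)

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, [peak_heights(n, alpha) for n in (10, 100, 1000)])
    print(tent(100, 0.3, 2.0), smooth(1000, 0.5, 2.0))   # fixed x: values near 0
```

For $\alpha < 1$ the peaks shrink like $n^{\alpha-1}$, while for $\alpha \ge 1$ they do not shrink at all, even though the value at each fixed $x$ still tends to 0.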


(d) The concepts of uniform and nonuniform convergence may, of course, 
be extended to an infinite series. We say that a series 


BX) + gol) + °° 
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is uniformly convergent, or not, according to the behavior of its partial sums 
f,(~). A very simple example of a nonuniformly convergent series is given by 
y2 y2 ye 
tA a ve 


2) = 2 +——. ———— R —-———— re 
ff) | +? 7 (1 + 2%)? E (1 +: ý 


For x = O every partial sum f(a) = a? +-+- + ?/(1 + x?)”™ has the value 
0; therefore f(0) = 0. For « #0 the series is simply a geometric series 


/ 
/f\(x) 
/ 


/ 


6 Jint) j 


lhw // 


Figure 7.7 Convergence to function with removable jump discontinuity. 


with the positive ratio 1/(1 +.«?) <1; we can therefore sum it by the 
elementary rules and thus obtain for every » # 0 the sum 


=1 +x. 
1—11 +?) di 
The limit function f(x) is thus given everywhere except at x = 0 by the 
expression f(r) = |] + wv?) whereas f(O) =0; it therefore has a removable 


discontinuity at the origin. 

Here again we have nonuniform convergence in every interval containing
the origin. For the difference $f(x) - f_n(x) = r_n(x)$ is always 0 for $x = 0$,
whereas it is given by the expression $r_n(x) = 1/(1 + x^2)^{n-1}$ for all other
values of $x$, as the reader may verify for himself. If we require this expression
to be less than, say, 1/2, then for each fixed value of $x$ this can be attained by
choosing $n$ large enough. But we can find no value of $n$ sufficiently large to
ensure that $r_n(x)$ is everywhere less than 1/2; for if we choose any value of $n$,
no matter how large, we can make $r_n(x)$ greater than 1/2 by taking $x$ near
enough to 0. A uniform approximation to within 1/2 is therefore impossible.
The matter becomes clear if we consider the approximating curves (cf. Fig.
7.7). These curves, except near $x = 0$, lie nearer and nearer to the parabola
$y = 1 + x^2$ as $n$ increases; near $x = 0$, however, the curves send down a
narrower and narrower extension to the origin, and as $n$ increases this
extension draws in closer and closer to a certain straight line, a portion of the
y-axis, so that for the limiting curve we have the parabola plus a linear
extension reaching vertically down to the origin.
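
The failure of uniformity can be made concrete numerically. The following Python sketch is only an illustration (the names `remainder` and `smallest_n_within` are ours, not the text's): it evaluates the remainder $r_n(x) = 1/(1 + x^2)^{n-1}$ and finds, for a fixed $x \neq 0$, the first index $n$ at which the remainder drops below 1/2. The index needed grows without bound as $x$ approaches 0, so no single $n$ serves the whole interval.

```python
def remainder(n: int, x: float) -> float:
    """Remainder r_n(x) = f(x) - f_n(x) of the series x^2 + x^2/(1+x^2) + ..."""
    if x == 0.0:
        return 0.0  # every partial sum, and the limit, vanish at the origin
    return (1.0 + x * x) ** (-(n - 1))


def smallest_n_within(x: float, tol: float = 0.5) -> int:
    """Smallest n with r_n(x) < tol at a fixed point x != 0."""
    n = 1
    while remainder(n, x) >= tol:
        n += 1
    return n


if __name__ == "__main__":
    for x in (1.0, 0.1, 0.01):
        print(x, smallest_n_within(x))
```

For each fixed $x$ the loop terminates, which is pointwise convergence; the growth of the returned index as $x$ shrinks is the nonuniformity.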

As a further example of nonuniform convergence we mention the series
$\sum_{\nu=0}^{\infty} g_\nu(x)$, where $g_\nu(x) = x^\nu - x^{\nu-1}$ for $\nu \ge 1$, $g_0(x) = 1$, defined in the
interval $0 \le x \le 1$. The partial sums of this series are the functions $x^n$
already considered in Example (a), p. 530.


b. A Test of Uniform Convergence 


The preceding considerations show us that the uniform convergence 
of a sequence or series is a special property not possessed by all 
sequences and series. We now repeat the definition of uniform con- 
vergence as it applies to infinite series: the series 


$$g_1(x) + g_2(x) + \cdots$$


is uniformly convergent to a function $f(x)$ in an interval if $f(x)$ can be
approximated to within a margin of approximation $\varepsilon$ (where $\varepsilon$ is an
arbitrarily small positive number) by the sum of a fixed and sufficiently
large number of terms $g_1(x) + \cdots + g_N(x) = f_N(x)$, independent of $x$
in the interval.

We again have a test (Cauchy’s test) for uniform convergence that
does not require knowledge of the limit function $f(x)$: the series
converges uniformly (or equivalently, the sequence of functions $f_n(x)$
converges uniformly) if and only if the difference $|f_n(x) - f_m(x)|$ can be
made less than an arbitrary quantity $\varepsilon$ everywhere in the interval by
choosing $n$ and $m$ larger than a number $N$ independent of $x$. For, first,
if the convergence is uniform, we can make $|f_n(x) - f(x)|$ and
$|f_m(x) - f(x)|$ both less than $\varepsilon/2$ by choosing $n$ and $m$ greater than a
number $N$ independent of $x$, from which it follows that $|f_n(x) - f_m(x)| < \varepsilon$;
and secondly, if $|f_n(x) - f_m(x)| < \varepsilon$ for all values of $x$ whenever $n$
and $m$ are greater than $N$, then on choosing any fixed value of $n > N$
and letting $m$ increase beyond all bounds we have the relation

$$|f_n(x) - f(x)| = \lim_{m \to \infty} |f_n(x) - f_m(x)| \le \varepsilon$$

for every value of $x$, so that the convergence is uniform.

As we shall see, it is just this condition of uniform convergence that
makes infinite series and other limiting processes with functions into
convenient and useful tools of analysis. Fortunately, in the limiting
processes usually encountered in analysis and its applications,
nonuniform convergence occurs only at isolated exceptional points and will
scarcely trouble us for the present.

Usually, the uniformity of convergence of a series is established 
by means of the following criterion (comparing the series with a 
majorant of constant terms): 


If the terms of the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ satisfy the condition $|g_\nu(x)| \le a_\nu$,
where the numbers $a_\nu$ are positive constants which form a convergent
series $\sum_{\nu=1}^{\infty} a_\nu$, then the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ converges uniformly (and absolutely).

For we then have

$$\left| \sum_{\nu=n}^{m} g_\nu(x) \right| \le \sum_{\nu=n}^{m} |g_\nu(x)| \le \sum_{\nu=n}^{m} a_\nu,$$

and since by Cauchy’s test the sum $\sum_{\nu=n}^{m} a_\nu$ can be made arbitrarily small
by choosing $n$ and $m > n$ large enough, this expresses exactly the
necessary and sufficient condition for uniform convergence.

A first example is offered by the geometric series $1 + x + x^2 + \cdots$,
where $x$ is restricted to the interval $|x| \le q$, $q$ being any positive number less
than 1. The terms of the series are then numerically less than or equal to the
terms of the convergent geometric series $\sum q^\nu$.

A further example is given by the “trigonometric series” 


$$\frac{c_1 \sin(x - \delta_1)}{1^2} + \frac{c_2 \sin(x - \delta_2)}{2^2} + \frac{c_3 \sin(x - \delta_3)}{3^2} + \cdots,$$

provided that $|c_n| < c$, where $c$ is a positive constant independent of $n$. For
then we have

$$g_n(x) = \frac{c_n \sin(x - \delta_n)}{n^2}, \quad \text{so that} \quad |g_n(x)| < \frac{c}{n^2}.$$

Hence the uniform and absolute convergence of the trigonometric series
follows from the convergence of the series $\sum_{\nu=1}^{\infty} c/\nu^2$.
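
The comparison with a majorant of constant terms can be checked numerically. The sketch below is only an illustration under stated assumptions: the coefficients $c_\nu = (-1)^\nu$ and phases $\delta_\nu = \nu$ are hypothetical choices of ours (the text leaves them arbitrary, subject to a common bound), and the grid of sample points is likewise arbitrary. It verifies that a block of terms of the trigonometric series never exceeds the corresponding block of the majorant $\sum 1/\nu^2$, whatever $x$ is.

```python
import math


def g(nu: int, x: float) -> float:
    """Term c_nu * sin(x - delta_nu) / nu^2 with hypothetical c_nu and delta_nu."""
    c_nu = (-1.0) ** nu    # assumed coefficients, bounded by 1 in absolute value
    delta_nu = float(nu)   # assumed phase shifts
    return c_nu * math.sin(x - delta_nu) / nu ** 2


def block(n: int, m: int, x: float) -> float:
    """Sum of the terms g_n(x) + ... + g_m(x)."""
    return sum(g(nu, x) for nu in range(n, m + 1))


def majorant_block(n: int, m: int, c: float = 1.0) -> float:
    """Corresponding block c/n^2 + ... + c/m^2 of the constant majorant."""
    return sum(c / nu ** 2 for nu in range(n, m + 1))


def worst_block(n: int, m: int, grid) -> float:
    """Largest absolute value of the block over the sample points."""
    return max(abs(block(n, m, x)) for x in grid)


if __name__ == "__main__":
    grid = [0.1 * k for k in range(-100, 101)]
    print(worst_block(50, 200, grid), majorant_block(50, 200))
```

The bound on the right does not involve $x$ at all, which is exactly why the convergence it certifies is uniform.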


c. Continuity of the Sum of a Uniformly Con- 
vergent Series of Continuous Functions 


The significance of uniform convergence lies in the fact that a 
uniformly convergent series in many respects behaves exactly like the 
sum of a finite number of functions. Thus, for example, the sum of a
finite number of continuous functions is itself continuous, and
correspondingly we have the following theorem.


THEOREM. If a series of continuous terms converges uniformly in an 
interval, its sum is also a continuous function. 
PROOF. The proof is quite simple. We subdivide the series

$$f(x) = g_1(x) + g_2(x) + \cdots$$

into the $n$th partial sum $f_n(x)$ plus the remainder $R_n(x)$. As usual,
$f_n(x) = g_1(x) + \cdots + g_n(x)$. If now any positive number $\varepsilon$ is assigned,
we can in virtue of the uniform convergence choose the number $n$ so
large that the remainder is less than $\varepsilon/4$ in absolute value throughout the
whole interval, and hence

$$|R_n(x + h) - R_n(x)| < \frac{\varepsilon}{2}$$

for every pair of numbers $x$ and $x + h$ in the interval. The partial sum
$f_n(x)$ consists of the sum of a finite number of continuous functions and
is therefore continuous; for each point $x$ in the interval, therefore, we
can choose a positive $\delta$ so small that

$$|f_n(x + h) - f_n(x)| < \frac{\varepsilon}{2}$$

provided $|h| < \delta$ and the points $x$ and $x + h$ lie in the interval. It then
follows that

$$|f(x + h) - f(x)| = |f_n(x + h) - f_n(x) + R_n(x + h) - R_n(x)| \le |f_n(x + h) - f_n(x)| + |R_n(x + h) - R_n(x)| < \varepsilon,$$

which expresses the continuity of our function.


The importance of this theorem becomes clear when we recall from our
previous examples that the sums of nonuniformly convergent series of
continuous functions are not necessarily continuous. From the preceding
theorem we may conclude: if the sum of a convergent series of
continuous functions has a point of discontinuity, then in every 
neighborhood of this point the convergence is nonuniform. Hence 
every representation of discontinuous functions by series of continuous 
functions must be based on the use of nonuniformly convergent limiting 
processes. 


d. Integration of Uniformly Convergent Series 


A sum of a finite number of continuous functions can be integrated
“term by term”; that is, the integral of the sum is obtained by integrating
each term separately and adding the integrals. In a convergent infinite
series of continuous functions the same procedure is permissible,
provided that the series converges uniformly in the interval of integration.


A series $\sum_{\nu=1}^{\infty} g_\nu(x) = f(x)$ which converges uniformly in an interval can be
integrated term by term in that interval; or, more precisely, if $a$ and $x$
are two numbers in the interval of uniform convergence, the series
$\sum_{\nu=1}^{\infty} \int_a^x g_\nu(t)\,dt$ converges, and, in fact, converges uniformly with respect to
$x$, its sum being equal to $\int_a^x f(t)\,dt$.


To prove this we write as before

$$f(x) = \sum_{\nu=1}^{\infty} g_\nu(x) = f_n(x) + R_n(x).$$

We have assumed that the separate terms of the series are continuous;
hence by Section 7.4c the sum is also continuous and therefore
integrable. Now if $\varepsilon$ is any positive number, we can find a number $N$ so
large that for every $n > N$ the inequality $|R_n(x)| < \varepsilon$ holds for every
value of $x$ in the interval. By the mean value theorem of the integral
calculus we have

$$\left| \int_a^x [f(t) - f_n(t)]\,dt \right| \le \varepsilon l,$$

where $l$ is the length of the interval of integration. Since the
integration of the finite sum $f_n(x)$ can be performed term by term, this
gives us

$$\left| \int_a^x f(t)\,dt - \sum_{\nu=1}^{n} \int_a^x g_\nu(t)\,dt \right| \le \varepsilon l.$$

But since $\varepsilon l$ can be made as small as we please, this states that

$$\sum_{\nu=1}^{\infty} \int_a^x g_\nu(t)\,dt = \lim_{n \to \infty} \sum_{\nu=1}^{n} \int_a^x g_\nu(t)\,dt = \int_a^x f(t)\,dt,$$

which was to be proved.


1 Observe that in this theorem we must take definite integrals. Thus, for example,
the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ with $g_\nu(x) = 0$ converges uniformly; taking the indefinite
integral $\int g_\nu(x)\,dx = \text{constant} = c$ of each term, however, leads to the generally
divergent series $\sum_{\nu=1}^{\infty} c$.

If, instead of infinite series, we wish to deal with sequences of
functions, our result can be expressed in the following way:


If in an interval the sequence of functions $f_1(x), f_2(x), \ldots$ tends
uniformly to the limit function $f(x)$, then

(7) $$\int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx$$

for every pair of numbers $a$ and $b$ lying in the interval; in other words,
we can then interchange the order of the operations of integration and
passing to the limit.


This fact is not a triviality. From a naive point of view such as prevailed in
the eighteenth century it is true that the interchangeability of the two processes
is hardly to be doubted; but a glance at the examples in 7.4a shows us that
in nonuniform convergence the preceding equation might not hold. We need
only consider Example b, p. 530, in which the integral of the limit function
is 0, whereas the integral of the function $f_n(x)$ over the interval $0 \le x \le 1$,
that is to say, the area of the triangle in Fig. 7.5, has the value

$$\int_0^1 f_n(x)\,dx = n^{\alpha - 2},$$

and when $\alpha \ge 2$ this does not tend to zero. Here we immediately see
from the figure that the reason for the difference between $\int_0^1 f(x)\,dx$ and
$\lim_{n \to \infty} \int_0^1 f_n(x)\,dx$ lies in the nonuniformity of the convergence.

On the other hand, by considering values of $\alpha$ such that $1 \le \alpha < 2$, we
see that the equation $\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 f(x)\,dx$ can hold good although
the convergence is nonuniform. As a further example, the series $\sum_{n=0}^{\infty} g_n(x)$,
where $g_n(x) = x^n - x^{n-1}$ for $n \ge 1$ and $g_0(x) = 1$, can be integrated term by
term between the limits 0 and 1, even though it does not converge uniformly.
Thus, although uniformity of convergence is a sufficient condition for
term-by-term integrability, it is by no means a necessary condition.
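
The last example can be checked by exact arithmetic. The following Python sketch is offered as an illustration only (the function names are ours): it integrates each term $g_n(x) = x^n - x^{n-1}$, $g_0(x) = 1$, over $0 \le x \le 1$ and adds the results. The sums telescope to $1/(N + 1)$, which tends to 0, the integral of the limit function; yet at the point $x = 2^{-1/N}$ the partial sum $x^N$ still equals 1/2, so the convergence is not uniform.

```python
from fractions import Fraction


def integral_of_term(n: int) -> Fraction:
    """Exact integral of g_n over [0, 1], where g_0 = 1 and g_n = x^n - x^(n-1)."""
    if n == 0:
        return Fraction(1)
    return Fraction(1, n + 1) - Fraction(1, n)


def sum_of_integrals(N: int) -> Fraction:
    """Term-by-term integral of the partial sum g_0 + ... + g_N."""
    return sum((integral_of_term(n) for n in range(N + 1)), Fraction(0))


def partial_sum(N: int, x: float) -> float:
    """The partial sum g_0(x) + ... + g_N(x), which collapses to x^N."""
    return x ** N


def witness(N: int) -> float:
    """A point in (0, 1) at which the N-th partial sum equals 1/2."""
    return 0.5 ** (1.0 / N)


if __name__ == "__main__":
    for N in (1, 10, 100):
        print(N, sum_of_integrals(N), partial_sum(N, witness(N)))
```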


e. Differentiation of Infinite Series 


The behavior of uniformly convergent series or sequences with
respect to differentiation is quite different from that with respect to
integration. For example, the sequence of functions $f_n(x) = \dfrac{\sin n^2 x}{n}$
certainly converges uniformly to the limit function $f(x) = 0$, but the
derivative $f_n'(x) = n \cos n^2 x$ certainly does not converge everywhere
to the derivative of the limit function $f'(x) = 0$, as we see by considering
$x = 0$. In spite of the uniformity of the convergence, therefore, we
cannot interchange the processes of differentiation and passage to
the limit.
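
A short numerical illustration of this example, written as a Python sketch with names of our own choosing: the functions $f_n(x) = \sin(n^2 x)/n$ are bounded by $1/n$ everywhere, while their derivatives $n \cos(n^2 x)$ take the value $n$ at $x = 0$.

```python
import math


def f(n: int, x: float) -> float:
    """f_n(x) = sin(n^2 x) / n, which tends uniformly to 0."""
    return math.sin(n * n * x) / n


def f_prime(n: int, x: float) -> float:
    """f_n'(x) = n cos(n^2 x), which is unbounded at x = 0 as n grows."""
    return n * math.cos(n * n * x)


if __name__ == "__main__":
    for n in (1, 10, 100):
        sup_estimate = max(abs(f(n, 0.001 * k)) for k in range(-3000, 3001))
        print(n, sup_estimate, f_prime(n, 0.0))
```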
Corresponding statements of course hold for infinite series. For
example, the series

$$\sin x + \frac{\sin 2^4 x}{2^2} + \frac{\sin 3^4 x}{3^2} + \cdots$$

is absolutely and uniformly convergent, for its terms are numerically not
greater than the terms of the convergent series $\dfrac{1}{1^2} + \dfrac{1}{2^2} + \dfrac{1}{3^2} + \cdots$.
If, however, we differentiate the series term by term, we obtain the
series

$$\cos x + 2^2 \cos 2^4 x + 3^2 \cos 3^4 x + \cdots,$$

which plainly diverges at $x = 0$.
The only useful criterion which assures us in special cases that
term-by-term differentiation is permissible is given by the following theorem.


If, on differentiating a convergent infinite series $\sum_{\nu=0}^{\infty} G_\nu(x) = F(x)$ term
by term, we obtain a uniformly convergent series of continuous terms
$\sum_{\nu=0}^{\infty} g_\nu(x) = f(x)$, then the sum of this last series is equal to the derivative
of the sum of the first series.


This theorem therefore expressly requires that after differentiating the 
series term by term we must still investigate whether the result of the 
differentiation is a uniformly convergent series or not. 

The proof of the theorem is almost trivial. For by the theorem in
Section 7.4d we can integrate term by term the series obtained by
differentiation. Recalling that $g_\nu(t) = G_\nu'(t)$, we obtain

$$\int_a^x f(t)\,dt = \int_a^x \left( \sum_{\nu=0}^{\infty} g_\nu(t) \right) dt = \sum_{\nu=0}^{\infty} \int_a^x g_\nu(t)\,dt = \sum_{\nu=0}^{\infty} [G_\nu(x) - G_\nu(a)] = F(x) - F(a).$$

This being true for every value of $x$ in the interval of uniform
convergence, it follows that

$$f(x) = F'(x),$$

which was to be proved.


7.5 Power Series


Power series occupy a most important position among infinite series. 
By a power series we mean a series of the type 


(8) $$P(x) = c_0 + c_1 x + c_2 x^2 + \cdots = \sum_{\nu=0}^{\infty} c_\nu x^\nu$$

(“power series in $x$”), or more generally

(8a) $$P(x) = c_0 + c_1 (x - x_0) + c_2 (x - x_0)^2 + \cdots = \sum_{\nu=0}^{\infty} c_\nu (x - x_0)^\nu$$

(“power series in $(x - x_0)$”), where $x_0$ is a fixed number. If in the last
series we introduce $\xi = x - x_0$ as a new variable, it becomes a power
series $\sum_{\nu=0}^{\infty} c_\nu \xi^\nu$ in the new variable $\xi$, and we can therefore confine our
attention to power series of the more special form $\sum_{\nu=0}^{\infty} c_\nu x^\nu$ without any
loss of generality.
In Chapter 5 (p. 446) we considered the approximate representation 
of functions by polynomials and were thus led to the expansion of 
functions in Taylor series, which are, in fact, power series. In this 
section we shall study power series in somewhat greater detail, and shall 
obtain the expansions of some of the most important functions in 
series more conveniently than before. 


a. Convergence Properties of Power Series—Interval of Convergence 


There are power series which converge for no value of x except, of 
course, for x = 0, as for example, the series 
$$x + 2^2 x^2 + 3^3 x^3 + \cdots + n^n x^n + \cdots.$$

For if $x \neq 0$, we can find an integer $N$ such that $|x| > 1/N$. Then all
the terms $n^n x^n$ for which $n > N$ will be greater than 1 in absolute value,
and, in fact, as $n$ increases $n^n x^n$ will increase beyond all bounds, so that
the series fails to converge.
On the other hand, there are series which converge for every value of 
x; for example, the power series for the exponential function 
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$

whose convergence for every value of $x$ follows at once from the ratio
test (criterion 5a, p. 522). The $(n + 1)$th term divided by the $n$th term
gives $x/n$, and, whatever number $x$ is chosen, this ratio tends to zero
as $n$ increases.

The behavior of power series with regard to convergence is expressed
in the following fundamental theorem.


If a power series in $x$ converges for a value $x = \xi$, it converges
absolutely for every value $x$ such that $|x| < |\xi|$, and the convergence is uniform
in every interval $|x| \le \eta$, where $\eta$ is any positive number less than $|\xi|$.
Here $\eta$ may lie as near $|\xi|$ as we please.

The proof is simple. If the series $\sum_{\nu=0}^{\infty} c_\nu \xi^\nu$ converges, its terms tend to
zero as $\nu$ increases. From this follows the weaker statement that the
terms all lie below a bound $M$ independent of $\nu$, that is, $|c_\nu \xi^\nu| < M$. If
now $q$ is any number such that $0 < q < 1$, and if we restrict $x$ to the
interval $|x| \le q |\xi|$, then $|c_\nu x^\nu| \le |c_\nu \xi^\nu| q^\nu < M q^\nu$. In this interval,
therefore, the terms of our series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$ are smaller in absolute value
than the terms of the convergent geometric series $\sum M q^\nu$. Hence from
the theorem on p. 535 the absolute and uniform convergence of the
series in the interval $-q |\xi| \le x \le q |\xi|$ follows.

If a power series does not converge everywhere, that is, if there is a
value $x = \xi$ for which it diverges, it must diverge for every value of $x$
such that $|x| > |\xi|$. For if it were convergent for such a value of $x$, by
the theorem above it would have to converge for the numerically
smaller value $\xi$.

From this we recognize that a power series which converges for at
least one value of $x$ other than 0 and which diverges for at least one
value of $x$ has an interval of convergence; that is, a definite positive
number $\rho$ exists such that for $|x| > \rho$ the series diverges and for
$|x| < \rho$ the series converges. For $|x| = \rho$ no general statement can be
made. Here $\rho$ is just the least upper bound of the values $x$ for which
the series converges (such a least upper bound exists by the theorem
on p. 98 since the values $x$ for which the series converges form a
bounded set). The limiting cases, those in which the series converges
only for $x = 0$ and those in which it converges everywhere, are
expressed symbolically by writing $\rho = 0$ and $\rho = \infty$ respectively.¹


1 It is possible to find this interval of convergence directly from the coefficients
$c_n$ of the series. If the limit $\lim_{n \to \infty} \sqrt[n]{|c_n|}$ exists, then

$$\rho = \frac{1}{\lim_{n \to \infty} \sqrt[n]{|c_n|}}.$$

For the general case, see Problem 8, p. 569.

For example, for the geometric series $1 + x + x^2 + \cdots$ we have $\rho = 1$;
at the end points of the interval of convergence the series diverges. Similarly,
for the series for the inverse tangent (p. 444),

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots,$$

we have $\rho = 1$, and at both the end points $x = \pm 1$ of the interval of
convergence the series converges, as we recognize at once from Leibnitz’s test
(p. 514).
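
Both observations can be explored numerically. The Python sketch below is a rough illustration, not a rigorous procedure: `root_test_radius` (a name of ours) approximates $\rho$ by evaluating $1/\sqrt[n]{|c_n|}$ at a single large index, which is only meaningful when the limit exists and $c_n \neq 0$; the coefficient sequences used in the demonstration are hypothetical examples. `arctan_partial` sums the inverse tangent series, whose partial sums at $x = \pm 1$ approach $\pm\pi/4$ with an error not exceeding the first omitted term.

```python
import math


def root_test_radius(coeff, n: int) -> float:
    """Approximate rho by 1 / |c_n|^(1/n) at one large index n (needs c_n != 0)."""
    return 1.0 / abs(coeff(n)) ** (1.0 / n)


def arctan_partial(x: float, terms: int) -> float:
    """Partial sum x - x^3/3 + x^5/5 - ... with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


if __name__ == "__main__":
    # Hypothetical coefficients c_n = 2^n: the estimate should be near 1/2.
    print(root_test_radius(lambda n: 2.0 ** n, 200))
    # Geometric series, c_n = 1: the estimate is 1.
    print(root_test_radius(lambda n: 1.0, 50))
    # Slow convergence at the end point x = 1.
    print(arctan_partial(1.0, 1000), math.pi / 4)
```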


From the uniform convergence we derive the important fact that 
within its interval of convergence (if such an interval exists) the power 
series represents a continuous function. 


b. Integration and Differentiation of Power Series 


Because of the uniformity of convergence it is always permissible to 
integrate a power series 


$$f(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$$

term by term over any closed interval lying entirely within the interval
of convergence. We thus obtain the function

(9) $$F(x) = c + \sum_{\nu=0}^{\infty} \frac{c_\nu}{\nu + 1} x^{\nu + 1},$$

for which $F'(x) = f(x)$ and $F(0) = c$.


We may also differentiate a power series term by term within its 
interval of convergence, thus obtaining the equation 


(10) $$f'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu - 1}.$$


In order to prove this statement we need only show that the series
on the right converges uniformly if $x$ is restricted to an interval lying
entirely within the interval of convergence. Suppose then that $\xi$ is a
number, lying as close to $\rho$ as we please, for which $\sum_{\nu=1}^{\infty} c_\nu \xi^\nu$ converges;
then, as we have seen before, the numbers $|c_\nu \xi^\nu|$ all lie below a bound
$M$ independent of $\nu$, so that $|c_\nu \xi^{\nu - 1}| < \dfrac{M}{|\xi|} = N$. Now let $q$ be any
number such that $0 < q < 1$; if we restrict $x$ to the interval $|x| \le q |\xi|$,
the terms of the infinite series (10) are not greater in absolute value than
those of the series $\sum_{\nu=1}^{\infty} |\nu c_\nu q^{\nu - 1} \xi^{\nu - 1}|$, and therefore less than those of the series
$\sum_{\nu=1}^{\infty} N \nu q^{\nu - 1}$. However, in this last series the ratio of the $(n + 1)$th term
to the $n$th term is $q(n + 1)/n$, which tends to $q$ as $n$ increases. Since
$0 < q < 1$, it follows [criterion (5a)] that this series converges. Hence
the series obtained by differentiation converges uniformly, and by the
theorem on p. 539 represents the derivative $f'(x)$ of the function $f(x)$,
which proves our statement.

If we apply this result again to the power series 


$$f'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu - 1},$$

we find on differentiating term by term that

$$f''(x) = \sum_{\nu=2}^{\infty} \nu(\nu - 1) c_\nu x^{\nu - 2},$$

and, continuing the process, we arrive at the theorem: Every function
represented by a power series can be differentiated as often as we please
within the interval of convergence, and the differentiation can be
performed term by term.
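
On the level of coefficients these rules are purely mechanical. The following Python sketch, with function names of our own, represents a truncated power series by its list of coefficients $c_0, c_1, c_2, \ldots$ and carries out term-by-term differentiation and integration; it says nothing about convergence, which must be settled separately as above.

```python
from fractions import Fraction


def differentiate(coeffs):
    """Coefficients of f'(x): the term c_v x^v becomes v c_v x^(v-1)."""
    return [nu * c for nu, c in enumerate(coeffs)][1:]


def integrate(coeffs, constant=Fraction(0)):
    """Coefficients of F(x) = constant + sum of c_v x^(v+1) / (v+1)."""
    return [constant] + [Fraction(c) / (nu + 1) for nu, c in enumerate(coeffs)]


if __name__ == "__main__":
    geometric = [1, 1, 1, 1, 1, 1]              # 1 + x + x^2 + ... (truncated)
    derivative = differentiate(geometric)       # 1 + 2x + 3x^2 + ...
    print(derivative)
    print(integrate(derivative, constant=geometric[0]))
```

Integrating the differentiated list with the constant $c_0$ restores the original coefficients, in agreement with $F'(x) = f(x)$ and $F(0) = c$.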


c. Operations with Power Series 


The preceding theorems on the behavior of power series are our 
justification for operating in the same way with power series as with 
polynomials. It is obvious that two power series can be added or sub- 
tracted by adding or subtracting the corresponding coefficients (see 
p. 520). It is also clear that a power series, like any other convergent 
series, can be multiplied by a constant factor by multiplying each term 
by that factor. On the other hand, the multiplication and division of 
two power series require somewhat more detailed study, for which we 


1 As an explicit expression for the $k$th derivative we obtain

$$f^{(k)}(x) = \sum_{\nu=k}^{\infty} \nu(\nu - 1) \cdots (\nu - k + 1) c_\nu x^{\nu - k},$$

or in a slightly different form,

$$\frac{f^{(k)}(x)}{k!} = \sum_{\nu=k}^{\infty} \binom{\nu}{k} c_\nu x^{\nu - k} = \sum_{\nu=0}^{\infty} \binom{k + \nu}{k} c_{k + \nu} x^\nu.$$

These two formulas are frequently useful.

refer the reader to the Appendix (p. 555). Here we merely mention
without proof that two power series

$$f(x) = \sum_{\nu=0}^{\infty} a_\nu x^\nu$$

and

$$g(x) = \sum_{\nu=0}^{\infty} b_\nu x^\nu$$

can be multiplied together like polynomials. To be specific, we have
the following theorem: Throughout the common part of the intervals
of convergence of these two series their product is given by the convergent
power series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$, where the coefficients $c_\nu$ are given by the formulas

$$c_0 = a_0 b_0,$$
$$c_1 = a_0 b_1 + a_1 b_0,$$
$$c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0,$$
$$\cdots$$
$$c_\nu = a_0 b_\nu + a_1 b_{\nu - 1} + \cdots + a_\nu b_0.$$
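
The formula for the product coefficients translates directly into a few lines of code. The sketch below (the name `cauchy_product` is ours) works on truncated coefficient lists and returns only as many product coefficients as both inputs determine completely.

```python
def cauchy_product(a, b):
    """Coefficients c_v = a_0 b_v + a_1 b_(v-1) + ... + a_v b_0 of the product."""
    n = min(len(a), len(b))  # coefficients beyond this would need unknown terms
    return [sum(a[k] * b[nu - k] for k in range(nu + 1)) for nu in range(n)]


if __name__ == "__main__":
    geometric = [1, 1, 1, 1, 1]                  # 1/(1 - x), truncated
    print(cauchy_product(geometric, geometric))  # coefficients of 1/(1 - x)^2
```

Multiplying the geometric series by itself yields the coefficients $\nu + 1$, which agrees with the series obtained by differentiating $1/(1 - x)$ term by term.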


d. Uniqueness of Expansion 


In the theory of power series the following fact is of importance: if
two power series $\sum_{\nu=0}^{\infty} a_\nu x^\nu$ and $\sum_{\nu=0}^{\infty} b_\nu x^\nu$ both converge in an interval which
contains the point $x = 0$ in its interior, and if in that interval the two
series represent the same function $f(x)$, then they are identical, that is,
the equation $a_n = b_n$ is true for every value of $n$. In other words:


A function f(x) can be represented by a power series in x in only one 
way, if at all. 


Briefly: the representation of a function by a power series is “unique.”
For the proof we need only notice that the difference of the two power
series, that is, the power series $\phi(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$ with coefficients
$c_\nu = a_\nu - b_\nu$, represents the function

$$\phi(x) = f(x) - f(x) = 0$$

in the interval; that is, this last power series converges to the limit 0
everywhere in the interval. For $x = 0$, in particular, the sum of the
series must be 0; that is, $c_0 = 0$, so that $a_0 = b_0$. We now differentiate
the series in the interior of the interval, obtaining $\phi'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu - 1}$.
However, $\phi'(x)$ is also 0 throughout the interval; hence for $x = 0$, in
particular, we have $c_1 = 0$ or $a_1 = b_1$. Continuing this process of
differentiating and then putting $x = 0$, we find successively that all the
coefficients $c_\nu$ are equal to zero, which proves the theorem.

In addition, we can draw the following conclusion from our
discussion: if we take the $\nu$th derivative of a series $f(x) = \sum a_\nu x^\nu$ and then
put $x = 0$, we at once obtain

$$a_\nu = \frac{1}{\nu!} f^{(\nu)}(0),$$

that is,


Every power series which converges for points other than $x = 0$ is the
Taylor series of the function which it represents.


The uniqueness of the expansion corresponds to the fact that the 
coefficients can be expressed in terms of the function itself. 


*e. Analytic Functions 


For functions f(x) which can be expressed by power series, the name 
“analytic functions” has been used since the importance of such functions 
was first recognized by Lagrange. Specifically, f(x) is called analytic in the 
neighborhood of x = a if in this neighborhood an expansion of f(x) as a 
convergent power series in x — a is possible. 

While functions which are not at all or not everywhere analytic do play a 
great role in analysis and applications (see Chapter 8), the analytic functions
are particularly important, for they share with polynomials many simple 
features. 

For example, an analytic function which does not vanish identically will
have some nonvanishing derivative for $x = a$. Let $r$ be the smallest number
for which $f^{(r)}(a) \neq 0$. Then $f$, having a zero of order $r$ at the point $x = a$,
can be represented as a product $f(x) = (x - a)^r g(x)$, where $g(x)$ is an
analytic function for which $g(a) = \dfrac{1}{r!} f^{(r)}(a)$ is different from zero. (Compare
Chapter 5, p. 463.) Indeed, the possibility of factoring out the power $(x - a)^r$
follows immediately from the convergence of the respective power series.

Also, as is seen from the continuity of the convergent power series for g(x), 
the factor g(x) cannot vanish in a suitably small neighborhood of x = a, 
or: the zeros of f(x) are isolated unless, of course, f vanishes identically. 

Since the same is true of the function $f'(x)$ it follows that in a finite interval
an analytic function is piecewise monotone, that is, it cannot change its
character of monotonicity infinitely often; thus the graph of $y = f(x)$
cannot have infinitely many intersections with a line $y = \text{constant}$ (or any
straight line) in a finite interval.

One may note that these last statements are not necessarily true for
nonanalytic functions, such as for $y = \sin(1/x)\, e^{-1/x^2}$, in the neighborhood of
$x = 0$ (see p. 462).


7.6 Expansion of Given Functions in Power Series. 
Method of Undetermined Coefficients. Examples 


Within its interval of convergence every power series represents a 
continuous function with continuous derivatives of all orders. We 
shall now discuss the converse problem of the expansion of a given 
function in a power series. In theory we can always do this by means of 
Taylor’s theorem; in practice we often meet with difficulties in the 
actual calculation of the nth derivative and in the estimation of the 
remainder. But we can often reach our goal more simply by making 
use of the following device. We first write down tentatively
$f(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$, where the coefficients $c_\nu$ are unknown to begin with. Then by
some known property of the function $f(x)$ we determine the coefficients,
and then prove the convergence of the series. The series represents a 
function, and it only remains to prove that this function is identical 
with f(x). Because of the uniqueness of the expansion in power series 
we know that no other series than the one just found can be the re- 
quired expansion. Actually, we have earlier obtained the series for 
arc tan x and log (1 + x) by a method related to the idea of this chapter. 
For we simply integrated term by term the series for the derivatives 
of these functions, which we knew to be geometric series. We shall 
now consider some examples of this method. 


a. The Exponential Function 


As we saw in Chapter 3, Section 4a, p. 223, the function $y = e^x$ is
completely characterized by the differential equation $y' = y$ and the initial
condition $y = 1$ for $x = 0$. We can use these properties directly to find the
power series for the exponential function. Our problem is to find a function 
f(x) for which f'(x) = f(x) and f(0) = 1. If we write tentatively the series 
with undetermined coefficients 


$$f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$$

and differentiate it, we obtain

$$f'(x) = c_1 + 2 c_2 x + 3 c_3 x^2 + \cdots.$$

Since by hypothesis these two power series must be identical, we have the
equation

$$n c_n = c_{n-1},$$

true for all values of $n \ge 1$. If we observe that because of the relation
$f(0) = 1$ the coefficient $c_0$ must have the value 1, we can calculate all the
coefficients successively, and obtain the power series

$$f(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$

As we easily see by the ratio test, this series converges for all values of $x$
and therefore represents a function for which the relations $f'(x) = f(x)$,
$f(0) = 1$ are actually fulfilled. (Here we intentionally avoid making any
use of what we have previously learned about the expansion of the exponential
function.)
Since only the function $e^x$ possesses these properties we readily deduce that
the function $f(x)$ is identical with $e^x$.
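
The recursion $n c_n = c_{n-1}$ with $c_0 = 1$ is easily carried out by machine. The Python sketch below (function names are ours) generates the coefficients exactly and then evaluates the truncated series, which may be compared with a library value of $e^x$; the comparison is a plausibility check, not a proof of convergence.

```python
import math
from fractions import Fraction


def exp_coefficients(count: int):
    """First `count` coefficients from c_0 = 1 and n c_n = c_(n-1)."""
    coeffs = [Fraction(1)]
    for n in range(1, count):
        coeffs.append(coeffs[-1] / n)
    return coeffs


def evaluate(coeffs, x: float) -> float:
    """Value of the truncated series at x, computed by Horner's scheme."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + float(c)
    return total


if __name__ == "__main__":
    print(exp_coefficients(5))
    print(evaluate(exp_coefficients(30), 1.0), math.e)
```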


b. The Binomial Series 


We can now return to the binomial series (Section 5.5c, p. 456), this 
time making use of the method of undetermined coefficients. We wish 
to expand the function $f(x) = (1 + x)^\alpha$ in a power series, and therefore
write

$$f(x) = (1 + x)^\alpha = c_0 + c_1 x + c_2 x^2 + \cdots,$$

the coefficients $c_\nu$ being undetermined. We now notice that our function
obviously satisfies the relation

$$(1 + x) f'(x) = \alpha f(x) = \sum_{\nu=0}^{\infty} \alpha c_\nu x^\nu.$$

On the other hand, if we differentiate the series for $f(x)$ term by term and
multiply by $(1 + x)$, we obtain

$$(1 + x) f'(x) = c_1 + (2 c_2 + c_1) x + (3 c_3 + 2 c_2) x^2 + \cdots;$$

and since these two power series for $(1 + x) f'(x)$ must be identical,

$$\alpha c_0 = c_1, \quad \alpha c_1 = 2 c_2 + c_1, \quad \alpha c_2 = 3 c_3 + 2 c_2, \ldots.$$

Now it is certain that $c_0 = 1$, since our series must have the value 1 for $x = 0$,
and so we obtain in succession the expressions

$$c_1 = \alpha, \quad c_2 = \frac{(\alpha - 1)\alpha}{1 \cdot 2}, \quad c_3 = \frac{(\alpha - 2)(\alpha - 1)\alpha}{1 \cdot 2 \cdot 3}$$

for the coefficients, and in general, as is easily established, we have

$$c_\nu = \frac{\alpha(\alpha - 1) \cdots (\alpha - \nu + 1)}{\nu(\nu - 1) \cdots 2 \cdot 1} = \binom{\alpha}{\nu}.$$

Substituting these values for the coefficients, we have the series $\sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu$;
we have yet to investigate the convergence of this series and to show that it
actually represents $(1 + x)^\alpha$.
By the ratio test we find that when $\alpha$ is not a positive integer, the series
converges if $|x| < 1$ and diverges if $|x| > 1$; for then the ratio of the $(n + 1)$th
term to the $n$th term is $\dfrac{\alpha - n + 1}{n}\, x$, and the absolute value of this expression
tends to $|x|$ as $n$ increases beyond all bounds.¹ Hence, if $|x| < 1$ our series
represents a function $f(x)$ which satisfies the condition $(1 + x) f'(x) = \alpha f(x)$,
as follows from the method of forming the coefficients. Moreover, $f(0) = 1$.
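Together, these two conditions ensure that the function $f(x)$ is identical with
$(1 + x)^\alpha$. For on putting

$$\phi(x) = \frac{f(x)}{(1 + x)^\alpha},$$

we find that

$$\phi'(x) = \frac{(1 + x)^\alpha f'(x) - \alpha (1 + x)^{\alpha - 1} f(x)}{(1 + x)^{2\alpha}} = 0;$$

$\phi(x)$ is therefore a constant, and, in fact, is always equal to 1, since $\phi(0) = 1$.
We have therefore proved that for $|x| < 1$

$$(1 + x)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu,$$

which is the binomial series.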
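
The comparison of coefficients gives $\alpha c_\nu = (\nu + 1) c_{\nu + 1} + \nu c_\nu$, that is, $c_{\nu + 1} = c_\nu (\alpha - \nu)/(\nu + 1)$. The Python sketch below (names are ours) runs this recursion in floating point and compares the truncated series with $(1 + x)^\alpha$ for a sample value inside $|x| < 1$; the sample values are arbitrary.

```python
def binomial_coefficients(alpha: float, count: int):
    """First `count` coefficients from c_0 = 1, c_(v+1) = c_v (alpha - v)/(v + 1)."""
    coeffs = [1.0]
    for nu in range(count - 1):
        coeffs.append(coeffs[-1] * (alpha - nu) / (nu + 1))
    return coeffs


def evaluate(coeffs, x: float) -> float:
    """Value of the truncated series at x, computed by Horner's scheme."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total


if __name__ == "__main__":
    print(binomial_coefficients(3, 6))   # terminates: ordinary binomial theorem
    print(evaluate(binomial_coefficients(0.5, 60), 0.2), 1.2 ** 0.5)
```

For a non-negative integer exponent the recursion produces zeros from some index on, so the series terminates.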
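Here we note the following special cases of the binomial series: the
geometric series

$$\frac{1}{1 + x} = (1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - + \cdots;$$

the series

$$\frac{1}{(1 + x)^2} = (1 + x)^{-2} = 1 - 2x + 3x^2 - 4x^3 + - \cdots = \sum_{\nu=0}^{\infty} (-1)^\nu (\nu + 1) x^\nu,$$
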


1 Here we state, without proof, the exact conditions under which this series converges.
If the index $\alpha$ is an integer $\ge 0$, the series terminates and is therefore valid for all
values of $x$ (becoming the ordinary binomial theorem). For all other values of $\alpha$ the
series is absolutely convergent for $|x| < 1$ and divergent for $|x| > 1$. For $x = +1$
the series converges absolutely if $\alpha > 0$, converges conditionally if $-1 < \alpha < 0$, and
diverges if $\alpha \le -1$. Finally, at $x = -1$ the series is absolutely convergent if $\alpha > 0$,
divergent if $\alpha < 0$.

which may also be obtained from the geometric series by differentiation;
and the series

$$\sqrt{1 + x} = (1 + x)^{1/2} = 1 + \frac{1}{2} x - \frac{1}{2 \cdot 4} x^2 + \frac{1 \cdot 3}{2 \cdot 4 \cdot 6} x^3 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 8} x^4 + - \cdots,$$

$$\frac{1}{\sqrt{1 + x}} = (1 + x)^{-1/2} = 1 - \frac{1}{2} x + \frac{1 \cdot 3}{2 \cdot 4} x^2 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} x^3 + \frac{1 \cdot 3 \cdot 5 \cdot 7}{2 \cdot 4 \cdot 6 \cdot 8} x^4 - + \cdots,$$

the first two or three terms of which form useful approximations.

c. The Series for arc sin x 


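This series can be obtained very easily by expanding the expression
$1/\sqrt{1 - t^2}$ according to the binomial series,

$$\frac{1}{\sqrt{1 - t^2}} = (1 - t^2)^{-1/2} = 1 + \frac{1}{2} t^2 + \frac{1 \cdot 3}{2 \cdot 4} t^4 + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} t^6 + \cdots.$$

This series converges if $|t| < 1$, and so converges uniformly if $|t| \le q < 1$.
On integrating term by term between 0 and $x$, we obtain

$$\arcsin x = x + \frac{1}{2} \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \frac{x^5}{5} + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \frac{x^7}{7} + \cdots;$$

by the ratio test we find that this converges if $|x| < 1$, and diverges if $|x| > 1$.
The deduction of this series from Taylor’s theorem would be decidedly
less convenient, owing to the difficulty of estimating the remainder.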

d. The Series for ar sinh x = log [x + √(1 + x²)]


We obtain this expansion by a similar method. Using the binomial
theorem we write down the series for the derivative of ar sinh $x$,

$$\frac{1}{\sqrt{1 + x^2}} = 1 - \frac{1}{2} x^2 + \frac{1 \cdot 3}{2 \cdot 4} x^4 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} x^6 + - \cdots,$$

and then integrate term by term. We thus obtain the expansion

$$\operatorname{ar\,sinh} x = x - \frac{1}{2} \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \frac{x^5}{5} - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \frac{x^7}{7} + - \cdots,$$

whose interval of convergence is $-1 \le x \le 1$.


e. Example of Multiplication of Series


The expansion of the function

$$\frac{\log(1 + x)}{1 + x}$$

is a simple example of the application of the rule for the multiplication of
power series. We have only to multiply the logarithmic series

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots$$

by the geometric series

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + x^4 - + \cdots;$$

as the reader may verify for himself, we obtain the remarkable expansion

$$\frac{\log(1 + x)}{1 + x} = x - \left(1 + \frac{1}{2}\right) x^2 + \left(1 + \frac{1}{2} + \frac{1}{3}\right) x^3 - \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}\right) x^4 + - \cdots$$

for $|x| < 1$.
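
The verification left to the reader can also be delegated to exact rational arithmetic. The Python sketch below (function names are ours) multiplies the truncated logarithmic series by the truncated geometric series according to the rule $c_\nu = a_0 b_\nu + a_1 b_{\nu - 1} + \cdots + a_\nu b_0$ and compares the result with the partial sums $1 + \frac{1}{2} + \cdots + \frac{1}{n}$ taken with alternating signs.

```python
from fractions import Fraction


def cauchy_product(a, b):
    """Coefficients c_v = a_0 b_v + a_1 b_(v-1) + ... + a_v b_0 of the product."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[nu - k] for k in range(nu + 1)) for nu in range(n)]


def log_over_one_plus_x(count: int):
    """First `count` coefficients of log(1 + x) / (1 + x) via the product rule."""
    log_series = [Fraction(0)] + [Fraction((-1) ** (n + 1), n) for n in range(1, count)]
    geometric = [Fraction((-1) ** n) for n in range(count)]
    return cauchy_product(log_series, geometric)


def signed_harmonic(n: int) -> Fraction:
    """(-1)^(n+1) (1 + 1/2 + ... + 1/n), the coefficient claimed for x^n."""
    return (-1) ** (n + 1) * sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    print(log_over_one_plus_x(6))
    print([signed_harmonic(n) for n in range(1, 6)])
```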


f. Example of Term-by-Term Integration (Elliptic Integral) 


In previous applications (pp. 300, 411) we have met with the elliptic integral

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} \qquad \text{for } (k^2 < 1)$$

[the period of oscillation of a pendulum]. In order to evaluate the integral
we can first expand the integrand by the binomial theorem, thus obtaining

$$\frac{1}{\sqrt{1 - k^2 \sin^2 \phi}} = 1 + \frac{1}{2} k^2 \sin^2 \phi + \frac{1 \cdot 3}{2 \cdot 4} k^4 \sin^4 \phi + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} k^6 \sin^6 \phi + \cdots.$$

Since $k^2 \sin^2 \phi$ is never greater than $k^2$ this series converges uniformly for
all values of $\phi$, and we may integrate term by term:

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} = \int_0^{\pi/2} d\phi + \frac{1}{2} k^2 \int_0^{\pi/2} \sin^2 \phi \, d\phi + \frac{1 \cdot 3}{2 \cdot 4} k^4 \int_0^{\pi/2} \sin^4 \phi \, d\phi + \cdots.$$

The integrals occurring here have already been calculated [cf. Eq. (76),
p. 279]. If we substitute their values, we have

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} = \frac{\pi}{2} \left\{ 1 + \left(\frac{1}{2}\right)^2 k^2 + \left(\frac{1 \cdot 3}{2 \cdot 4}\right)^2 k^4 + \left(\frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}\right)^2 k^6 + \cdots \right\}.$$


7.7 Power Series with Complex Terms 


a. Introduction of Complex Terms into Power Series. 
Complex Representations of the Trigonometric Functions 


The similarity between certain power series representing functions 
which are apparently unrelated led Euler to a purely formal connection 
between them by giving complex values, in particular, pure imaginary 
values, to the variable x. We shall first describe Euler’s formal, but 
most striking and fruitful discovery, unhindered by questions of rigor. We 
shall then indicate a more rigorous justification. 

The first relation of this sort is obtained if we replace the quantity x
in the series for $e^x$ by a pure imaginary $i\phi$, where $\phi$ is a real number. If
we recall the fundamental equation for the imaginary unit i, that is,
$i^2 = -1$, from which $i^3 = -i$, $i^4 = 1$, $i^5 = i, \ldots$ follows, then on
separating the real and the imaginary terms of the series, we obtain

\[ e^{i\phi} = \left(1 - \frac{\phi^2}{2!} + \frac{\phi^4}{4!} - \frac{\phi^6}{6!} + - \cdots\right) + i\left(\phi - \frac{\phi^3}{3!} + \frac{\phi^5}{5!} - \frac{\phi^7}{7!} + - \cdots\right), \]

or in another form,
(11) $e^{i\phi} = \cos\phi + i\sin\phi$.


This is the well-known and important "Euler formula," a landmark in
analysis; as yet it is purely formal.¹ It is consistent with De Moivre's
theorem (p. 105), which is expressed by the equation

\[ (\cos\phi + i\sin\phi)(\cos\psi + i\sin\psi) = \cos(\phi + \psi) + i\sin(\phi + \psi). \]

By virtue of Euler's formula this equation merely states that the
relation

\[ e^x \cdot e^y = e^{x+y} \]

continues to hold for pure imaginary values $x = i\phi$, $y = i\psi$.


¹ One consequence for $\phi = \pi$ is the formula $e^{i\pi} = -1$, a striking relation between
the three most important constants e, π and i.
It should be stated that this Euler formula and the addition theorem
$e^{i\phi} e^{i\psi} = e^{i(\phi + \psi)}$ may be used rigorously without further justification
simply by defining $e^{i\phi}$ as the complex number $\cos\phi + i\sin\phi$. This
definition is consistent with the ordinary rules for operating with 
exponentials. In particular, the ordinary rule for multiplying powers of 
e just furnishes simple concise expressions of the addition theorems 
of trigonometry as expressed by de Moivre’s formula which in turn is 
of an entirely elementary character. Therefore we are on safe ground 
when we make use of Euler’s relations without the benefit of a more 
general analysis of functions of a complex variable, as in the next 
section. 

More generally we can define the exponential function for an 
arbitrary complex exponent x + iy (where x and y are real) by the 
formula 

\[ e^{x+iy} = e^x e^{iy} = e^x(\cos y + i\sin y). \]
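The formal substitution of a pure imaginary argument into the exponential series can be tried numerically. In this sketch the helper `exp_series` (a plain partial sum with 40 terms) is an assumption of the illustration; it is compared with $\cos\phi + i\sin\phi$ and with the addition theorem.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of 1 + z + z^2/2! + ... for a real or complex argument z."""
    total = 0j
    term = 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

phi, psi = 0.75, 1.25
lhs = exp_series(1j * phi)
rhs = complex(math.cos(phi), math.sin(phi))

# Addition theorem e^(i phi) e^(i psi) = e^(i (phi + psi)), i.e. De Moivre's relation.
product = exp_series(1j * phi) * exp_series(1j * psi)
combined = exp_series(1j * (phi + psi))

# Complex exponent x + iy: e^(x + iy) = e^x (cos y + i sin y).
general = exp_series(complex(0.3, phi))
expected_general = math.exp(0.3) * rhs

print(abs(lhs - rhs), abs(product - combined), abs(general - expected_general))
```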


If we replace the variable x in the power series for cos x by the pure
imaginary ix we at once obtain the series for cosh x; this relation can
be expressed by the equation


(12) $\cosh x = \cos ix$.


In the same way we obtain


(13) $\sinh x = \frac{1}{i}\sin ix$.


Since Euler's formula also gives $e^{-i\phi} = \cos\phi - i\sin\phi$, we arrive
at the exponential expressions for the trigonometric functions,


(14) $\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad \cos x = \frac{e^{ix} + e^{-ix}}{2}$.

These are exactly analogous to the exponential expressions for the
hyperbolic functions and are, in fact, transformed into them by the
relations $\cosh x = \cos ix$, $\sinh x = \frac{1}{i}\sin ix$.


Corresponding formal relations can, of course, be obtained for
the functions tan x, tanh x, cot x, coth x, which are connected by the
equations $\tanh x = \frac{1}{i}\tan ix$, $\coth x = i\cot ix$.
Finally, similar relations can also be found for the inverse trigonometric
and hyperbolic functions. For example, from

\[ y = \tan x = \frac{e^{ix} - e^{-ix}}{i(e^{ix} + e^{-ix})} = \frac{e^{2ix} - 1}{i(e^{2ix} + 1)} \]

we immediately find that

\[ e^{2ix} = \frac{1 + iy}{1 - iy}. \]

If we take the logarithms of both sides of this equation and then write
x instead of y and arc tan x instead of x, we obtain the equation


(15) $\arctan x = \frac{1}{2i}\log\frac{1 + ix}{1 - ix}$,


which expresses a remarkable connection between the inverse tangent
and the logarithm. If in the known power series for $\frac12\log\frac{1 + x}{1 - x}$ (p. 444)
we replace x by ix, we actually obtain the power series for arc tan x,

\[ \arctan x = \frac{1}{i}\left(ix + \frac{(ix)^3}{3} + \frac{(ix)^5}{5} + \cdots\right) = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots. \]

These relations are as yet of a purely formal character and naturally 
call for a more exact statement of the meaning they are intended to 
convey. We have, however, seen above that by using proper defi- 
nitions these relations acquire a satisfactorily rigorous meaning. 
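The formal relations of this section can be spot-checked with Python's standard `cmath` module, which works with the principal value of the complex logarithm. This is only a numerical illustration under that assumption; the function name `arctan_via_log` is chosen for the sketch.

```python
import cmath
import math

def arctan_via_log(x):
    """Evaluate (1/(2i)) * log((1 + ix)/(1 - ix)) with the principal logarithm."""
    return cmath.log((1 + 1j * x) / (1 - 1j * x)) / 2j

samples = [-3.0, -0.5, 0.0, 0.5, 3.0]
arctan_errors = [abs(arctan_via_log(x) - math.atan(x)) for x in samples]

x = 1.3
cosh_error = abs(cmath.cos(1j * x) - math.cosh(x))          # cosh x = cos ix
sinh_error = abs(cmath.sin(1j * x) / 1j - math.sinh(x))     # sinh x = (1/i) sin ix
tanh_error = abs(cmath.tan(1j * x) / 1j - math.tanh(x))     # tanh x = (1/i) tan ix

print(max(arctan_errors), cosh_error, sinh_error, tanh_error)
```

For real x the quotient (1 + ix)/(1 - ix) has absolute value 1 and argument 2 arc tan x between -π and π, so the principal logarithm gives exactly the branch of the inverse tangent used in the text.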


*b. A Glance at the General Theory of 
Functions of a Complex Variable 


Although the purely formal point of view indicated in the last Section 
is in itself free from objection, it is still desirable to recognize in the 
preceding formulas something more than a mere formal connection. 
This goal leads to the general theory of complex functions, as (for the 
sake of brevity) we call the general theory of the so-called analytic 
functions of a complex variable. As our starting point we may use a 
general discussion of the theory of power series with complex variables 
and complex coefficients. The construction of such a theory of power 
series offers no difficulty once we define the concept of limit in the 
domain of complex numbers; in fact, it parallels the theory of real 
power series almost exactly. However, as we shall not make any use 
of these matters in what follows we shall content ourselves here by 
stating certain facts, omitting proofs. It is found that the following 
generalization of the theorem of Section 7.5a holds for complex
power series:


If a power series converges for any complex value $x = \xi$ whatever, then
it converges absolutely for every value x for which $|x| < |\xi|$; if it diverges
for a value $x = \xi$, then it diverges for every value x for which $|x| > |\xi|$. A
power series which does not converge everywhere, but does converge for
some other point in addition to x = 0, possesses a circle of convergence,
that is, there exists a number $\rho > 0$ such that the series converges
absolutely for $|x| < \rho$ and diverges for $|x| > \rho$.


Having once established the concept of functions of a complex 
variable represented by power series, and having developed the rules 
for operating with such functions, we can think of the functions $e^x$,
sin x, cos x, arc tan x, etc., of the complex variable x as simply defined
by the power series which represent them for real values of x.

We shall indicate by two examples how this introduction of complex 
variables illuminates the behavior of the elementary functions. 
The geometric series for $1/(1 + x^2)$ ceases to converge when x leaves
the interval $-1 < x < 1$, and so does the series for arc tan x, although
there are no peculiarities in the behavior of these functions at the ends
of the interval of convergence; in fact, they and all their derivatives are
continuous for all real values of x. On the other hand, we can readily
understand that the series for $1/(1 - x^2)$ and $\log(1 - x)$ cease to converge
as x passes through the value 1, since they become infinite there.

But the divergence of the series for the inverse tangent and the series
$\sum_{\nu=0}^{\infty} (-1)^{\nu} x^{2\nu}$ for $|x| > 1$ immediately becomes clear if we consider complex
values of x also. For we find that when x = i the functions become
infinite and so cannot be represented by a convergent series. Hence
by our theorem about the circle of convergence the series must diverge
for all values of x such that $|x| > |i| = 1$; in particular, for real
values of x the series diverge outside the interval $-1 \le x \le 1$.
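A small numerical experiment makes the role of the point x = i visible. The helper `geometric_partial_sum` and the sample points 0.9, 1.1, and 0.999i are arbitrary choices for this sketch.

```python
def geometric_partial_sum(x, n):
    """Sum of (-1)^v * x^(2v) for v = 0, ..., n-1; x may be real or complex."""
    return sum((-1) ** v * x ** (2 * v) for v in range(n))

# Inside the circle of convergence the partial sums settle on 1/(1 + x^2).
inside = geometric_partial_sum(0.9, 400)
inside_error = abs(inside - 1 / (1 + 0.9 ** 2))

# Outside it they grow without bound, although 1/(1 + x^2) is perfectly smooth at x = 1.1.
outside = [abs(geometric_partial_sum(1.1, n)) for n in (10, 20, 40)]

# Approaching x = i along the imaginary axis, the function itself becomes large.
near_pole = abs(1 / (1 + (0.999j) ** 2))

print(inside_error, outside, near_pole)
```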

Another example is given by the function $f(x) = e^{-1/x^2}$ for $x \ne 0$,
$f(0) = 0$ (see p. 462), which, in spite of its completely smooth behavior,
cannot be expanded in a Taylor series. As a matter of fact, this function
ceases to be continuous if we take pure imaginary values of $x = i\xi$
into account. The function then takes the form $e^{1/\xi^2}$ and increases
beyond all bounds as $\xi \to 0$. It is therefore clear that no power series in
x can represent this function for all complex values of x in a neighborhood
of the origin, no matter how small a neighborhood we choose.

These remarks on the theory of functions and power series of a 
complex variable must suffice for us here. 
Appendix


*A.1 Multiplication and Division of Series 


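a. Multiplication of Absolutely Convergent Series


Let

\[ A = \sum_{\nu=0}^{\infty} a_\nu, \qquad B = \sum_{\nu=0}^{\infty} b_\nu \]

be two absolutely convergent series. Together with these we consider the
corresponding convergent series of absolute values

\[ \bar A = \sum_{\nu=0}^{\infty} |a_\nu| \quad \text{and} \quad \bar B = \sum_{\nu=0}^{\infty} |b_\nu|. \]

We further put

\[ A_n = \sum_{\nu=0}^{n-1} a_\nu, \quad B_n = \sum_{\nu=0}^{n-1} b_\nu, \quad \bar A_n = \sum_{\nu=0}^{n-1} |a_\nu|, \quad \bar B_n = \sum_{\nu=0}^{n-1} |b_\nu| \]

and $c_n = a_0b_n + a_1b_{n-1} + \cdots + a_nb_0$.

We assert that the series $\sum_{\nu=0}^{\infty} c_\nu$ is absolutely convergent, and that its sum
is equal to AB.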
To prove this, we write down the series 


\[ a_0b_0 + a_1b_0 + a_1b_1 + a_0b_1 + a_2b_0 + a_2b_1 + a_2b_2 + a_1b_2 + a_0b_2 + \cdots + a_nb_0 + a_nb_1 + \cdots + a_nb_n + \cdots + a_1b_n + a_0b_n + \cdots, \]

the $n^2$th partial sum of which is $A_nB_n$, and we assert that it converges
absolutely. For the partial sums of the corresponding series with absolute
values increase monotonically; the $n^2$th partial sum is equal to $\bar A_n\bar B_n$, which
is less than $\bar A\bar B$ (and which tends to $\bar A\bar B$). The series with absolute values
therefore converges, and the series written down above converges absolutely.
The sum of the series is obviously AB, since its $n^2$th partial sum is $A_nB_n$,
which tends to AB as $n \to \infty$. We now interchange the order of the terms,
which is permissible for absolutely convergent series, and bracket successive 
terms together. In a convergent series we may bracket successive terms 
together in as many places as we desire without disturbing the convergence or 
altering the sum of the series, for if we bracket together, say, all the terms 
$(a_{n+1} + a_{n+2} + \cdots + a_m)$, then when we form the partial sums we shall
omit those partial sums that originally fell between $s_n$ and $s_m$, which does not
affect the convergence or change the value of the limit. Also, if the series 
was absolutely convergent before the brackets were inserted, it remains 
absolutely convergent. Since the series

\[ \sum_{\nu=0}^{\infty} c_\nu = (a_0b_0) + (a_0b_1 + a_1b_0) + (a_0b_2 + a_1b_1 + a_2b_0) + \cdots \]


is formed in this way from the series written down above, the required proof 
is complete. 
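The theorem can be illustrated with two geometric series, for which every quantity is known in closed form. The particular series, the truncation at 60 terms, and the helper name `cauchy_coefficients` are assumptions of this sketch.

```python
def cauchy_coefficients(a, b):
    """c_n = a_0 b_n + a_1 b_(n-1) + ... + a_n b_0 for n below the common length."""
    n_max = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_max)]

N = 60
a = [0.5 ** v for v in range(N)]            # sum A = 1 / (1 - 1/2) = 2
b = [(-1.0 / 3.0) ** v for v in range(N)]   # sum B = 1 / (1 + 1/3) = 3/4

c = cauchy_coefficients(a, b)
product_sum = sum(c)
absolute_sum = sum(abs(term) for term in c)  # stays bounded: the product series converges absolutely

print(product_sum, absolute_sum)
```

The printed sum should agree with AB = 3/2 up to rounding, and the sum of absolute values stays below 2 · 3/2 = 3, the product of the two series of absolute values.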


*b. Multiplication and Division of Power Series 


The principal use of our theorem is found in the theory of power series.
The following assertion is an immediate consequence of it: The product of
the two power series

\[ \sum_{\nu=0}^{\infty} a_\nu x^\nu \quad \text{and} \quad \sum_{\nu=0}^{\infty} b_\nu x^\nu \]

is represented in the interval of convergence common to the two power
series by a third power series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$, whose coefficients are given by

\[ c_\nu = a_0b_\nu + a_1b_{\nu-1} + \cdots + a_\nu b_0. \]


*As for the division of power series, we can likewise represent the quotient
of the two power series above by a power series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$, provided $b_0$, the
constant term in the denominator, does not vanish. (If $b_0$ does vanish, such
a representation is in general impossible; for it could not converge at x = 0
on account of the vanishing of the denominator, whereas on the other hand,
every power series must converge at x = 0.) The coefficients of the power
series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$ can be calculated by remembering that

\[ \sum_{\nu=0}^{\infty} q_\nu x^\nu \cdot \sum_{\nu=0}^{\infty} b_\nu x^\nu = \sum_{\nu=0}^{\infty} a_\nu x^\nu, \]

so that the following equations must be true:

\[ \begin{aligned}
a_0 &= q_0b_0, \\
a_1 &= q_0b_1 + q_1b_0, \\
a_2 &= q_0b_2 + q_1b_1 + q_2b_0, \\
&\;\;\vdots \\
a_\nu &= q_0b_\nu + q_1b_{\nu-1} + \cdots + q_\nu b_0.
\end{aligned} \]

From the first of these equations $q_0$ is readily found, from the second we find
the value $q_1$, from the third (by using the values of $q_0$ and $q_1$) we find the value
$q_2$, etc. In order to give strict justification for the expression of the quotient
of two power series by the third power series we have to investigate the
convergence of the formally calculated power series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$. However, we
shall make no further use of the result and content ourselves with the
statement that the series for the quotient does actually converge in some interval
about the origin. The proof is omitted.
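The triangular system for the quotient coefficients can be solved one equation at a time. The sketch below does this in exact rational arithmetic and, as an illustrative choice not made in the text, applies it to the quotient of the sine and cosine series; `divide_series` is a name introduced here.

```python
from fractions import Fraction
from math import factorial

def divide_series(a, b):
    """Solve a_n = q_0 b_n + q_1 b_(n-1) + ... + q_n b_0 successively for q_0, q_1, ..."""
    if b[0] == 0:
        raise ValueError("the constant term of the denominator must not vanish")
    q = []
    for n in range(min(len(a), len(b))):
        known = sum(q[k] * b[n - k] for k in range(n))  # terms with q_0, ..., q_(n-1)
        q.append((a[n] - known) / b[0])
    return q

N = 8
sin_coeffs = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 1 else Fraction(0)
              for n in range(N)]
cos_coeffs = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
              for n in range(N)]

tan_coeffs = divide_series(sin_coeffs, cos_coeffs)
print(tan_coeffs)
```

The result reproduces the first coefficients of the tangent series, x + x³/3 + 2x⁵/15 + 17x⁷/315.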


A.2 Infinite Series and Improper Integrals 


The infinite series and the concepts developed in connection with 
them have simple applications and analogies in the theory of improper 
integrals (cf. Chapter 4, p. 301). We confine ourselves to the case of a 
convergent integral with an infinite interval of integration, say an 


integral of the form $\int_0^\infty f(x)\,dx$. If we divide the interval of integration
by a sequence of numbers $x_0 = 0, x_1, \ldots$ tending monotonically to
$+\infty$, we can write the improper integral in the form

\[ \int_0^\infty f(x)\,dx = a_1 + a_2 + \cdots, \]

where each term of our infinite series is an integral;

\[ a_1 = \int_{x_0}^{x_1} f(x)\,dx, \quad a_2 = \int_{x_1}^{x_2} f(x)\,dx, \quad \ldots, \]


and so on. This is true no matter how we choose the points x,. We 
can therefore relate the idea of a convergent improper integral to that 
of an infinite series in many ways. 
It is especially convenient to choose the points $x_\nu$ in such a way that the
integrand does not change sign within any individual subinterval. The
series $\sum_{\nu=1}^{\infty} |a_\nu|$ then corresponds to the integral of the absolute value of
our function,

\[ \int_0^\infty |f(x)|\,dx. \]

We are thus naturally led to the following concept: an improper
integral $\int_0^\infty f(x)\,dx$ is said to be absolutely convergent if the integral
$\int_0^\infty |f(x)|\,dx$ converges. Otherwise, if our integral exists at all, we say
that it is conditionally convergent.
Some of the integrals considered earlier (pp. 307 to 309), such as

\[ \int_0^\infty \frac{dx}{1 + x^2}, \qquad \int_0^\infty e^{-x^2}\,dx, \qquad \Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt, \]

are absolutely convergent.
On the other hand, the important "Dirichlet" integral

\[ I = \int_0^\infty \frac{\sin x}{x}\,dx = \lim_{A \to \infty} \int_0^A \frac{\sin x}{x}\,dx, \]

studied on p. 309, is the typical example of a conditionally convergent
integral. The simplest proof of convergence is by reduction to an
absolutely convergent integral: We write $\sin x = (1 - \cos x)' = 2(\sin^2 (x/2))'$
and use integration by parts, transforming I into the absolutely
convergent form

\[ I = 2\int_0^\infty \frac{\sin^2 (x/2)}{x^2}\,dx. \]

(Note that the new integrand $\sin^2(x/2)/x^2$ approaches continuously the limit 1/4 for
$x \to 0$ and vanishes of the order $x^{-2}$ for $x \to \infty$.)

*A different proof of the convergence is obtained if we subdivide the
interval from 0 to A at the points $x_\nu = \nu\pi$ ($\nu = 0, 1, 2, \ldots, \mu_A$), where
$\mu_A$ is the largest possible integer for which $\mu_A\pi \le A$. We therefore
divide the integral into terms of the form

\[ a_\nu = \int_{(\nu-1)\pi}^{\nu\pi} \frac{\sin x}{x}\,dx, \quad \text{for } \nu = 1, 2, \ldots, \mu_A, \]

and a remainder $R_A$ of the form

\[ \int_{\mu_A\pi}^{A} \frac{\sin x}{x}\,dx \qquad (0 \le A - \mu_A\pi < \pi). \]


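Obviously, the quantities $a_\nu$ have alternating signs, since sin x is
alternately positive and negative in consecutive intervals. Moreover,
$|a_{\nu+1}| < |a_\nu|$; for on applying the transformation $x = \xi - \pi$, we have

\[ |a_\nu| = \int_{(\nu-1)\pi}^{\nu\pi} \frac{|\sin x|}{x}\,dx = \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin(\xi - \pi)|}{\xi - \pi}\,d\xi = \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin\xi|}{\xi - \pi}\,d\xi > \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin\xi|}{\xi}\,d\xi = |a_{\nu+1}|. \]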

Hence by Leibnitz's test we see that $\sum a_\nu$ converges. Moreover, the
remainder $R_A$ has the absolute value

\[ |R_A| = \left| \int_{\mu_A\pi}^{A} \frac{\sin x}{x}\,dx \right| \le \int_{\mu_A\pi}^{(\mu_A+1)\pi} \frac{|\sin x|}{x}\,dx < \frac{1}{\mu_A\pi} \int_{\mu_A\pi}^{(\mu_A+1)\pi} dx = \frac{1}{\mu_A}, \]

and this tends to 0 as A increases. Thus, if we let A tend to ∞ in the
equation

\[ \int_0^A \frac{\sin x}{x}\,dx = a_1 + a_2 + a_3 + \cdots + a_{\mu_A} + R_A, \]

the right-hand side tends to $\sum_{\nu=1}^{\infty} a_\nu$ as a limit, and our integral is convergent.

But the convergence is not absolute, for

\[ |a_\nu| > \frac{1}{\nu\pi} \int_{(\nu-1)\pi}^{\nu\pi} |\sin x|\,dx = \frac{2}{\nu\pi}, \]

so that $\sum |a_\nu|$ diverges.
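The behavior of the terms $a_\nu$ can be observed numerically. The sketch below approximates each $a_\nu$ with a composite Simpson rule; the helper names `simpson` and `sinc`, the 200 subintervals, and the cut-off at 200 terms are assumptions of the illustration. The value π/2 of the Dirichlet integral, used in the last comparison, is a classical result that is not derived in this passage.

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3

def sinc(x):
    """sin(x)/x, continued by the value 1 at x = 0."""
    return math.sin(x) / x if x != 0.0 else 1.0

# a[v-1] approximates the integral of sin(x)/x over [(v-1)*pi, v*pi].
a = [simpson(sinc, (v - 1) * math.pi, v * math.pi) for v in range(1, 201)]

signs_alternate = all(a[i] * a[i + 1] < 0 for i in range(len(a) - 1))
decreasing = all(abs(a[i + 1]) < abs(a[i]) for i in range(len(a) - 1))
lower_bound_holds = all(abs(a[v - 1]) > 2 / (v * math.pi) for v in range(1, 201))

partial_sum = sum(a)                       # approximates the integral from 0 to 200*pi
absolute_sum = sum(abs(t) for t in a)      # grows like a harmonic sum

print(partial_sum, math.pi / 2, absolute_sum)
```

The alternating partial sums settle near π/2, while the sum of absolute values keeps growing roughly like (2/π) log n, in line with the comparison with the harmonic series.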


*A.3 Infinite Products 


In the introduction to this chapter (p. 511), we stated that infinite 
series are only one way, although a particularly important one, of 
representing numbers or functions by infinite processes. As an example 
of another such process, we consider infinite products. No proofs will 
be given. 

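On p. 281 we encountered Wallis's product,

\[ \frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots, \]

in which the number π/2 is expressed as an "infinite product." Generally
speaking, by the value of the infinite product

\[ \prod_{\nu=1}^{\infty} a_\nu = a_1 \cdot a_2 \cdot a_3 \cdot a_4 \cdots \]

we mean the limit of the sequence of "partial products"

\[ a_1, \quad a_1 \cdot a_2, \quad a_1 \cdot a_2 \cdot a_3, \quad a_1 \cdot a_2 \cdot a_3 \cdot a_4, \quad \ldots, \]

provided it exists.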
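The definition by partial products is easy to try out on Wallis's product, taken here in the form $\frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdots$ with the factors grouped in pairs. The function name `wallis_partial` is an assumption of this sketch.

```python
import math

def wallis_partial(n):
    """Partial product of the first n factor pairs (2k/(2k-1)) * (2k/(2k+1))."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return p

for n in (10, 100, 1000, 100000):
    print(n, wallis_partial(n), math.pi / 2 - wallis_partial(n))
```

Every factor pair exceeds 1, so the partial products increase; they approach π/2 only slowly, with an error that shrinks roughly in proportion to 1/n.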

The factors $a_1, a_2, a_3, \ldots$, of course, may also be functions of a
variable x. An especially interesting example is the "infinite product"
for the function sin πx,


(16) $\sin \pi x = \pi x\left(1 - \frac{x^2}{1^2}\right)\left(1 - \frac{x^2}{2^2}\right)\left(1 - \frac{x^2}{3^2}\right)\cdots$,


which we shall obtain in Section 8.5, p. 603.


The infinite product for the zeta function plays a very important role
in the theory of numbers. In order to retain the notation usual in the theory
of numbers we here denote the independent variable by s, and we define the
zeta function for s > 1, following Riemann, by the expression

\[ \zeta(s) = \sum_{\nu=1}^{\infty} \frac{1}{\nu^s}. \]

We know (Section 7.2c, p. 525) that the series on the right converges if s > 1.
If p is any number greater than 1, we obtain the equation

\[ \frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots \]

by expanding the left-hand side in a geometric series with the quotient $p^{-s}$.
If we imagine this series written down for all the prime numbers $p_1, p_2, p_3, \ldots$
in increasing order of magnitude, and all the equations thus formed multiplied
together, we obtain on the left a product of the form

\[ \frac{1}{1 - p_1^{-s}} \cdot \frac{1}{1 - p_2^{-s}} \cdots. \]

Without stopping to justify the process, we multiply together the series on
the right-hand sides of our equations; we obtain a sum of terms

\[ p_1^{-k_1 s}\,p_2^{-k_2 s}\,p_3^{-k_3 s} \cdots = \left(p_1^{k_1} p_2^{k_2} p_3^{k_3} \cdots\right)^{-s}, \]

where $k_1, k_2, k_3, \ldots$ are any nonnegative integers; also we remember that by
an elementary theorem each integer n > 1 can be expressed in one and only
one way as a product of powers of different prime numbers $n = p_1^{k_1} p_2^{k_2} \cdots$.
Thus we find that the product on the right is again the function ζ(s), and so we
obtain the remarkable "product form" of Euler


(17) $\zeta(s) = \prod_{\nu=1}^{\infty} \frac{1}{1 - p_\nu^{-s}}$.

This "product form," the derivation of which we have only briefly sketched
here, is actually an expression of the zeta function as an infinite product, since
the number of prime numbers is infinite.
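Euler's product form can be compared numerically with the defining series. The sketch below uses s = 2 and truncates both the sum and the product at N = 100000; these values, and the sieve helper `primes_up_to`, are choices made for the illustration. The closed value ζ(2) = π²/6 used in the comparison is a classical result not derived here.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p with p <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, n + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

s = 2.0
N = 100_000

zeta_sum = sum(1.0 / v ** s for v in range(1, N + 1))

euler_product = 1.0
for p in primes_up_to(N):
    euler_product *= 1.0 / (1.0 - p ** (-s))

print(zeta_sum, euler_product, math.pi ** 2 / 6)
```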


In the general theory of infinite products one usually excludes the
case where the product $a_1a_2 \cdots a_n$ has the limit zero. Hence it is
specially important that none of the factors $a_n$ should vanish.
In order that the product may converge, the factors $a_n$ must
accordingly tend to 1 as n increases. Since we can if necessary omit a
finite number of factors (this has no bearing on the question of
convergence), we may assume $a_\nu > 0$. The following almost trivial theorem
applies to this case:

A necessary and sufficient condition for the convergence of the
product $\prod_{\nu=1}^{\infty} a_\nu$, where $a_\nu > 0$, is that the series $\sum_{\nu=1}^{\infty} \log a_\nu$ should converge.
For the partial sums $\sum_{\nu=1}^{n} \log a_\nu = \log(a_1a_2 \cdots a_n)$ of this series will tend
to a definite limit if, and only if, the partial products $a_1a_2 \cdots a_n$ possess
a positive limit, as a consequence of the continuity of the logarithm.

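In studying convergence the following sufficient condition usually
applies, where $a_\nu = 1 + \alpha_\nu$. The product

\[ \prod_{\nu=1}^{\infty} (1 + \alpha_\nu) \]

converges, if the series

\[ \sum_{\nu=1}^{\infty} |\alpha_\nu| \]

converges and no factor $(1 + \alpha_\nu)$ is zero. In the proof we may assume,
after omission of a finite number of factors if necessary, that each
$|\alpha_\nu| < \tfrac12$. Then we have $1 - |\alpha_\nu| > \tfrac12$. By the mean value theorem

\[ \log(1 + h) = \log(1 + h) - \log 1 = \frac{h}{1 + \theta h} \quad \text{with } 0 < \theta < 1. \]

Therefore

\[ |\log(1 + \alpha_\nu)| = \left| \frac{\alpha_\nu}{1 + \theta\alpha_\nu} \right| \le \frac{|\alpha_\nu|}{1 - |\alpha_\nu|} < 2|\alpha_\nu|, \]

and so the convergence of the series $\sum_{\nu=1}^{\infty} \log(1 + \alpha_\nu)$ follows from the
convergence of $\sum_{\nu=1}^{\infty} |\alpha_\nu|$.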

From our criterion it follows that the infinite product (16) above for
sin πx converges for all values of x except for x = 0, ±1, ±2, ..., where
factors of the product are zero. As to the Riemann ζ-function, for $p \ge 2$ and
s > 1 we readily find that

\[ \frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s - 1}, \qquad 0 < \frac{1}{p^s - 1} < \frac{2}{p^s}. \]

Now if we let p assume all prime values, the series $\sum \frac{1}{p^s}$ must converge, since
its terms form only a part of the convergent series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^s}$. The convergence
of the product in Eq. (17) for s > 1 is thus proved. From the fact that the
series for ζ(s) for s = 1 (that is, the harmonic series) diverges, we can draw
the remarkable conclusion that the series of reciprocal prime numbers, that
is, the series

\[ \frac12 + \frac13 + \frac15 + \frac17 + \frac1{11} + \cdots \]

diverges. (Incidentally, this shows that the number of primes is infinite.)
Indeed, if the series of reciprocal primes were convergent, then also the series
with terms

\[ \alpha_k = \frac{1}{1 - p_k^{-1}} - 1 = \frac{p_k^{-1}}{1 - p_k^{-1}} \]

would be convergent, since $p_k \ge 2$ and

\[ 0 < \alpha_k \le 2p_k^{-1}. \]

Then, by our test, also the infinite product

\[ \prod_{k=1}^{\infty} (1 + \alpha_k) = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-1}} = \prod_{k=1}^{\infty} \left(1 + \frac{1}{p_k} + \frac{1}{p_k^2} + \cdots\right) \]

would be convergent; but then clearly the harmonic series would converge as
well, which is impossible.


*A.4 Series Involving Bernoulli Numbers 


So far we have given no expansions in power series for certain elementary 
functions, for example, tan x. The reason is that the numerical coefficients 
which occur are not of any simple form. We can express these coefficients, 
and those in the series for a number of other functions, in terms of the so- 
called Bernoulli numbers. These are curious rational numbers, with a 
somewhat hidden law of formation, which occur in many parts of analysis. 
The simplest way to arrive at them is by expanding the function

\[ \frac{x}{e^x - 1} \]

in a formal power series of the form

\[ \frac{x}{e^x - 1} = \sum_{\nu=0}^{\infty} \frac{B_\nu^*}{\nu!}x^\nu. \]¹

If we write this equation in the form

\[ x = (e^x - 1) \sum_{\nu=0}^{\infty} \frac{B_\nu^*}{\nu!}x^\nu \]

and substitute on the right the power series for $e^x - 1$, we obtain for the $B_\nu^*$
a recurrence relation

\[ \binom{n+1}{1}B_n^* + \binom{n+1}{2}B_{n-1}^* + \binom{n+1}{3}B_{n-2}^* + \cdots + \binom{n+1}{n+1}B_0^* = 0 \]

for n > 0, $B_0^* = 1$, from which the $B_\nu^*$ can easily be calculated successively.
These rational numbers are called Bernoulli numbers. They are rational
since in their formation only rational operations are concerned; as we easily
recognize, they vanish for all odd indices other than ν = 1. The first few are

\[ B_0^* = 1, \quad B_1^* = -\tfrac12, \quad B_2^* = \tfrac16, \quad B_4^* = -\tfrac1{30}, \quad B_6^* = \tfrac1{42}, \quad B_8^* = -\tfrac1{30}, \quad B_{10}^* = \tfrac5{66}, \ldots. \]
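The recurrence relation, written as $\sum_{\nu=0}^{n} \binom{n+1}{\nu} B_\nu^* = 0$ for n > 0 with $B_0^* = 1$, can be solved successively in exact arithmetic. The function name `bernoulli_numbers` is an assumption of this sketch, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return B*_0, ..., B*_(count-1) from sum_{v=0}^{n} C(n+1, v) B*_v = 0 (n > 0), B*_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        # All terms except the last one, C(n+1, n) * B*_n = (n + 1) * B*_n, are already known.
        known = sum(comb(n + 1, v) * B[v] for v in range(n))
        B.append(-known / (n + 1))
    return B

B = bernoulli_numbers(11)
for index, value in enumerate(B):
    print(index, value)
```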
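We must content ourselves with a brief hint as to how these numbers
are involved in the power series in question. First, by making use of the
transformation

\[ \frac{x}{e^x - 1} + \frac{x}{2} = \frac{x}{2}\,\frac{e^x + 1}{e^x - 1} = \frac{x}{2}\,\frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}} = \frac{x}{2}\coth\frac{x}{2} \]

we obtain

\[ \frac{x}{2}\coth\frac{x}{2} = \sum_{\nu=0}^{\infty} \frac{B_{2\nu}^*}{(2\nu)!}x^{2\nu}. \]

(This formula proves that $B_{2\nu+1}^* = 0$ for ν > 0, since $(x/2)\coth(x/2)$ is an
even function of x.)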
If we replace x by 2x, we have the series

\[ x\coth x = \sum_{\nu=0}^{\infty} \frac{2^{2\nu}B_{2\nu}^*}{(2\nu)!}x^{2\nu}, \]

valid, as can be shown, for $|x| < \pi$, from which, by replacing x by −ix, we
obtain (cf. p. 552)

\[ x\cot x = \sum_{\nu=0}^{\infty} (-1)^{\nu}\frac{2^{2\nu}B_{2\nu}^*}{(2\nu)!}x^{2\nu}, \qquad |x| < \pi. \]

By means of the equation 2 cot 2x = cot x − tan x we now obtain the
series

\[ \tan x = \sum_{\nu=1}^{\infty} (-1)^{\nu-1}\frac{2^{2\nu}(2^{2\nu} - 1)}{(2\nu)!}B_{2\nu}^*\,x^{2\nu-1}, \]

which holds for $|x| < \pi/2$.
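The tangent series can be checked against the library tangent. The sketch below recomputes the Bernoulli numbers from the recurrence $\sum_{\nu=0}^{n} \binom{n+1}{\nu} B_\nu^* = 0$ and then forms the coefficients $(-1)^{\nu-1} 2^{2\nu}(2^{2\nu} - 1) B_{2\nu}^* / (2\nu)!$; the function names and the use of ten terms at x = 0.5 are assumptions of the illustration.

```python
from fractions import Fraction
from math import comb, factorial, tan

def bernoulli_numbers(count):
    """B*_0, ..., B*_(count-1) from the recurrence with B*_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        known = sum(comb(n + 1, v) * B[v] for v in range(n))
        B.append(-known / (n + 1))
    return B

def tan_coefficient(v, B):
    """Coefficient of x^(2v-1) in the series for tan x."""
    power = 2 ** (2 * v)
    return (-1) ** (v - 1) * Fraction(power * (power - 1), factorial(2 * v)) * B[2 * v]

B = bernoulli_numbers(22)
coeffs = [tan_coefficient(v, B) for v in range(1, 11)]

x = 0.5
approx = sum(float(c) * x ** (2 * v - 1) for v, c in enumerate(coeffs, start=1))
print(coeffs[:4], approx, tan(x))
```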


For further information we refer the reader to Chapter 8 and to more 
detailed treatises.” 


1 In a slightly different notation (p. 623), the basic formula will be written 


ea) B, 
=] — 4x + 2 RT 


£ 
=] 


2 See, for example, K. Knopp, Theory and Application of Infinite Series, p. 183, 
Blackie & Son, Ltd., 1928 and K. Knopp, Infinite Sequences and Series, Dover 
Publications, 1956. 
PROBLEMS


SECTION 7.1, page 511
1. Prove that

\[ \sum_{\nu=1}^{\infty} \frac{1}{\nu(\nu+1)} = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots = 1 \]

[cf. Problems 1.6, 12(a)] and use the result to prove that $\sum_{\nu=1}^{\infty} \frac{1}{\nu^2}$ converges.

2. Use the result of Problem 1 to obtain upper and lower bounds for $\sum_{\nu=1}^{\infty} \frac{1}{\nu^2}$.

3. Prove that $\sum_{\nu=0}^{\infty} (-1)^{\nu}\frac{2\nu + 3}{(\nu + 1)(\nu + 2)} = 1$.
4. For what values of α does the series $1 - \frac{1}{2^\alpha} + \frac{1}{3} - \frac{1}{4^\alpha} + \cdots$ converge?
5. Prove that if $\sum_{\nu=1}^{\infty} a_\nu$ converges, and $s_n = a_1 + a_2 + \cdots + a_n$, then the
sequence

\[ \frac{s_1 + s_2 + \cdots + s_N}{N} \]

also converges, and has $\sum_{\nu=1}^{\infty} a_\nu$ as its limit.
= 2n 2n — 1 
. Ist i —__ — — t? 
6. Is the series 2 (z | F ) convergen 


7. Is the series > (-1)’ i convergent ? 
y=1 Y + 1 


co o0 


8. Prove that if $\sum_{\nu=1}^{\infty} a_\nu^2$ converges, so does $\sum_{\nu=1}^{\infty} \frac{a_\nu}{\nu}$.


9. (a) If $a_\nu$ is a monotonic increasing sequence with positive terms, when
does the series $\frac{1}{a_1} + \frac{1}{a_1a_2} + \frac{1}{a_1a_2a_3} + \cdots$ converge?
(b) Give an example of a monotone decreasing sequence with $\lim_{n\to\infty} a_n = 1$
for which the series diverges.
(c) Show that if decreasing sequences are allowed, then it is possible to
obtain convergent sums even when $\lim_{n\to\infty} a_n = 1$.

10. If the series $\sum_{\nu=1}^{\infty} a_\nu$ with decreasing positive terms converges, then
$\lim_{n\to\infty} na_n = 0$.
11. Show that the series $\sum_{\nu=1}^{\infty} \sin\frac{\pi}{\nu}$ diverges.
12. Prove that if $\sum a_\nu$ converges and if $b_1, b_2, b_3, \ldots$ is a bounded monotonic
sequence of numbers, then $\sum a_\nu b_\nu$ converges. Moreover, prove that if $S = \sum a_\nu b_\nu$
and if $\sum a_\nu \le M$, then $|S| \le Mb_1$.


13. A sequence $\{a_n\}$ is said to be of bounded variation if the series

\[ \sum_{i=1}^{\infty} |a_{i+1} - a_i| \]

converges.

(a) Prove that if the sequence $\{a_n\}$ is of bounded variation, then the
sequence $\{a_n\}$ converges.

(b) Find a divergent infinite series $\sum a_i$ whose elements $a_i$ constitute a
sequence which is of bounded variation.

(c) Prove the following generalization of Abel's convergence test (see page
515) due to Dedekind:

The series $\sum a_ip_i$ is convergent if $\sum a_i$ oscillates between finite bounds and
$\{p_i\}$ is a null sequence which is of bounded variation.

(d) Prove the convergence of the following infinite series: 


(a) ig 


(oy — 
n=2 

for z any a real number. 
14. Discuss the convergence or divergence of the following series: 


gz a)y 
—j|)» Ofr v v0 

TDA 1) a ose =r cos 

(c) > (> 


15. Find A sums of the following eee of the series 
Ld Po R. 


sin nx 


(—1)"; 


COs ue 


sin v6 


cos 76 = sin v6 


a P ee a aT 
for log 2: 
(@1—-}-243-$-$4+4-H—-wt-- 
(6) CAS ee a eee es 
16. Find whether the following series converge or diverge: 
COM i ge am ae ak ieee a ttot ae 
OSEE et Pete eee Pee Fe tS 


SECTION 7.2, page 520 
1. Prove that $\sum_{\nu=2}^{\infty} \frac{1}{\nu(\log\nu)^{\alpha}}$ converges when α > 1 and diverges when
α ≤ 1.

2. Prove that $\sum_{\nu=3}^{\infty} \frac{1}{\nu\log\nu\,(\log\log\nu)^{\alpha}}$ converges when α > 1 and diverges
when α ≤ 1.

3. Prove that if n is an arbitrary integer greater than 1

\[ \sum_{\nu=1}^{\infty} \frac{a_\nu^{n}}{\nu} = \log n, \]

where $a_\nu^{n}$ is defined as follows:

\[ a_\nu^{n} = \begin{cases} 1 & \text{if } n \text{ is not a factor of } \nu, \\ -(n - 1) & \text{if } n \text{ is a factor of } \nu. \end{cases} \]

4. Show that $\sum_{\nu=2}^{\infty} \frac{\log(\nu + 1) - \log\nu}{(\log\nu)^2}$ converges.
= 1-2-3-7% 


5. Show that > 


crete) oe ee 


diverges if « < 1. 
*6. By comparison with the series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^{\alpha}}$, prove the following test:
If $\frac{\log(1/|a_n|)}{\log n} > 1 + \varepsilon$ for some fixed number ε > 0 independent of n,
and for every sufficiently large n, the series $\sum a_\nu$ converges absolutely; if
$\frac{\log(1/|a_n|)}{\log n} < 1 - \varepsilon$ for every sufficiently large n and some number ε > 0
independent of n, the series $\sum a_\nu$ does not converge absolutely.


7. Show that the series > ( E +) converges. 
y=l1 v 
8. For what values of « do the following series converge? 
1 1 1,1 l 
(a) 1 — Ja t3 ~~ Ga tz ane rics a . 
1 l 1 1 1 


9. By comparison with the series $\sum \frac{1}{\nu(\log\nu)^{\alpha}}$, prove the following test:
The series $\sum |a_\nu|$ converges or diverges according as

\[ \frac{\log(1/(n|a_n|))}{\log\log n} \]

is greater than 1 + ε or less than 1 − ε for every sufficiently large n.
10. Derive the nth root test from the test of Problem 6. 


11. Prove the following comparison test: if the series $\sum b_\nu$ of positive terms
converges, and

\[ \left| \frac{a_{n+1}}{a_n} \right| \le \frac{b_{n+1}}{b_n} \]

from a certain term onward, the series $\sum a_\nu$ is absolutely convergent; if $\sum b_\nu$
diverges and

\[ \left| \frac{a_{n+1}}{a_n} \right| \ge \frac{b_{n+1}}{b_n} \]

from a certain term onwards, the series $\sum a_\nu$ is not absolutely convergent.


9,93 


*12. By comparison with $\sum_{\nu=1}^{\infty} \frac{1}{\nu^{\alpha}}$, prove "Raabe's" test:
The series $\sum |a_\nu|$ converges or diverges according as

\[ n\left( \frac{|a_n|}{|a_{n+1}|} - 1 \right) \]

is greater than 1 + ε or less than 1 − ε for every sufficiently large n and for
some ε > 0 independent of n.


13. By comparison with $\sum \frac{1}{\nu(\log\nu)^{\alpha}}$, prove the following test:
The series $\sum |a_\nu|$ converges or diverges according as

\[ n\log n\left( \frac{|a_n|}{|a_{n+1}|} - 1 - \frac{1}{n} \right) \]

is greater than 1 + ε or less than 1 − ε for every sufficiently large n.
14. Prove Gauss's test:
If

\[ \frac{|a_n|}{|a_{n+1}|} = 1 + \frac{\mu}{n} + \frac{R_n}{n^{1+\varepsilon}}, \]

where $|R_n|$ is bounded and ε > 0 is independent of n, then $\sum |a_\nu|$ converges
if μ > 1, diverges if μ ≤ 1.

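15. Test the following "hypergeometric" series for convergence or divergence:

(a) $\frac{\alpha}{\beta} + \frac{\alpha(\alpha + 1)}{\beta(\beta + 1)} + \frac{\alpha(\alpha + 1)(\alpha + 2)}{\beta(\beta + 1)(\beta + 2)} + \cdots$;

(b) $1 + \frac{\alpha\cdot\beta}{1\cdot\gamma} + \frac{\alpha(\alpha + 1)\cdot\beta(\beta + 1)}{1\cdot 2\cdot\gamma(\gamma + 1)} + \frac{\alpha(\alpha + 1)(\alpha + 2)\cdot\beta(\beta + 1)(\beta + 2)}{1\cdot 2\cdot 3\cdot\gamma(\gamma + 1)(\gamma + 2)} + \cdots$.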


SECTION 7.4, page 529 


1. The sequence $f_n(x)$, n = 1, 2, ..., is defined in the interval $0 \le x \le 1$
by the equations

\[ f_0(x) \equiv 1, \qquad f_n(x) = \sqrt{x f_{n-1}(x)}. \]

(a) Prove that in the interval $0 \le x \le 1$ the sequence converges to a
continuous limit.
*(b) Prove that the convergence is uniform.


*2. Let $f_0(x)$ be continuous in the interval $0 \le x \le a$. The sequence of
functions $f_n(x)$ is defined by

\[ f_n(x) = \int_0^x f_{n-1}(t)\,dt, \qquad n = 1, 2, \ldots. \]

Prove that in any fixed interval $0 \le x \le a$ the sequence converges uniformly
to 0.


*3. Let $f_n(x)$, n = 1, 2, ..., be a sequence of functions with continuous
derivatives in the interval $a \le x \le b$. Prove that if $f_n(x)$ converges at each
point of the interval and the inequality $|f_n'(x)| < M$ (where M is a constant)
is satisfied for all values of n and x, then the convergence is uniform.

4. (a) Show that the series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^x}$ converges uniformly for $x \ge 1 + \varepsilon$ with
ε > 0 any fixed number.

(b) Show that the derived series $-\sum_{\nu=1}^{\infty} \frac{\log\nu}{\nu^x}$ converges uniformly for
$x \ge 1 + \varepsilon$ with ε a fixed positive number.

*5. Show that the series $\sum_{\nu=1}^{\infty} \frac{\cos\nu x}{\nu^{\alpha}}$, α > 0, converges uniformly for
$\varepsilon \le x \le 2\pi - \varepsilon$ with ε any small positive value.


r— 1 n e 
red | Sle el SGT 


converges uniformly for e <x < N when «e, N are fixed positive numbers. 


6. The series 


7. Find the regions in which the following series are convergent: 


(a) a, @)So,a>1. 
(rz log v 
OÈ Ora 


x 


v 
— xy’ 


ODS.a<1. (Nd; 


*8. Prove that if the Dirichlet series $\sum \frac{a_\nu}{\nu^x}$ converges for $x = x_0$, it converges
for any $x > x_0$; if it diverges for $x = x_0$, it diverges for any $x < x_0$. Thus there
is an "abscissa of convergence" such that for any greater value of x the
series converges, and for any smaller value of x the series diverges.


*9. If $\sum \frac{a_\nu}{\nu^x}$ converges for $x = x_0$, the derived series $-\sum \frac{a_\nu\log\nu}{\nu^x}$ converges
for any $x > x_0$.


SECTION 7.5, page 540

1. If the interval of convergence of the power series $\sum a_\nu x^\nu$ is $|x| < \rho$, and
that of $\sum b_\nu x^\nu$ is $|x| < \rho'$, where $\rho < \rho'$, what is the interval of convergence
of $\sum (a_\nu + b_\nu)x^\nu$?

2. If $a_\nu > 0$ and $\sum a_\nu$ converges, then

\[ \lim_{x \to 1-0} \sum a_\nu x^\nu = \sum a_\nu. \]

3. If $a_\nu > 0$ and $\sum a_\nu$ diverges,

\[ \lim_{x \to 1-0} \sum a_\nu x^\nu = \infty. \]

*4. Prove Abel's theorem:
If $\sum a_\nu X^\nu$ converges, then $\sum a_\nu x^\nu$ converges uniformly for $0 \le x \le X$.


*5. If $\sum a_\nu X^\nu$ converges, then $\lim_{x \to X-0} \sum a_\nu x^\nu = \sum a_\nu X^\nu$.

*6. By multiplication of power series prove that

(a) $e^x e^y = e^{x+y}$, (b) $\sin 2x = 2\sin x\cos x$.

7. Using the binomial series, calculate $\sqrt{2}$ to four decimal places.


8. Let $a_n$ be any sequence of real numbers, and S the set of all limit points
of the $a_n$. We denote the least upper bound ρ of S by $\rho = \overline{\lim}\, a_n$. Show
that the power series $\sum_{n=0}^{\infty} c_n x^n$ converges for $|x| < \rho$ and diverges for $|x| > \rho$,
where

\[ \rho = \frac{1}{\overline{\lim}\, \sqrt[n]{|c_n|}}. \]

APPENDIX, page 555 

1. Prove that the power series for $\sqrt{1 - x}$ still converges when x = 1.

2. Prove that for every positive ε there is a polynomial in x which represents
$\sqrt{1 - x}$ in the interval $0 \le x \le 1$ with an error less than ε.


3. By setting $x = 1 - t^2$ in Problem 2, prove that for every positive ε there
is a polynomial in t which represents |t| in the interval $-1 \le t \le 1$ with an
error less than ε.

4. (a) Prove that if f(x) is continuous for $a \le x \le b$, then for every ε > 0
there exists a polygonal function φ(x) (that is, a continuous function whose
graph consists of a finite number of rectilinear segments meeting at corners)
such that $|f(x) - \varphi(x)| < \varepsilon$ for every x in the interval.

(b) Prove that every polygonal function φ(x) can be represented by a
sum $\varphi(x) = a + bx + \sum c_i|x - x_i|$, where the $x_i$'s are the abscissae of
the corners.


5. WEIERSTRASS' APPROXIMATION THEOREM. Prove on the basis of the last
statement that if f(x) is continuous in $a \le x \le b$, then for every positive ε
there exists a polynomial P(x) such that $|f(x) - P(x)| < \varepsilon$ for all values of
x in the interval $a \le x \le b$.

Hint: Approximate f(x) by linear combinations of the form $(x - x_r) + |x - x_r|$.
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6. Prove that the following infinite products converge: 


(a), (b), (c): [product formulas illegible in the source]; in (c) the condition is |z| < 1.

7. Prove by the methods of the text that $\prod_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)$ diverges.

8. Prove the identity [formula illegible in the source] for |x| < 1.

*9. Consider all the natural numbers which, represented in the decimal
system, have no 9 among their digits. Prove that the sum of the reciprocals
of these numbers converges.


10. (a) Prove that for s > 1,

$1 - \frac{1}{2^{s}} + \frac{1}{3^{s}} - \frac{1}{4^{s}} + \cdots = (1 - 2^{1-s})\,\zeta(s),$

where ζ(s) is the Zeta function defined on p. 560.
(b) Use this identity to show that $\lim_{s \to 1+} (s - 1)\,\zeta(s) = 1$.

11. Integral test for convergence

(a) Let f(x) be positive and decreasing for x ≥ 1. Prove that the improper
integral $\int_1^{\infty} f(x)\,dx$ and the infinite series $\sum_{k=1}^{\infty} f(k)$ either both converge or
both diverge.
(b) Prove that in either case the limit

$\lim_{n \to \infty} \left( \int_1^{n} f(x)\,dx - \sum_{k=1}^{n} f(k) \right)$

exists.
(c) Apply this test to prove that the series

$\sum_{n=2}^{\infty} \frac{1}{n \log^{\alpha} n}$

converges for α > 1 and diverges for α ≤ 1.


8 


Trigonometric Series 


The functions represented by power series, or as Lagrange called 
them, the “analytic functions,” play indeed a central role in analysis. 
But the class of analytic functions is too restricted in many instances. 
It was therefore an event of major importance for all of mathematics 
and for a great variety of applications when Fourier in his “Théorie 
analytique de la chaleur”¹ observed and illustrated by many examples
the fact that convergent trigonometric series of the form

(1)  $f(x) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$

with constant coefficients $a_\nu$, $b_\nu$ are capable of representing a wide class
of “arbitrary” functions f(x), a class which includes essentially every
function of specific interest, whether defined geometrically by mechanical
means, or in any other way: even functions possessing jump
discontinuities, or obeying different laws of formation in different
intervals, can thus be expressed.

Soon after Fourier’s dramatic discovery the “Fourier series” were
recognized not only as a most powerful tool for physics and mechanics, 
but just as much as a fruitful source of many beautiful purely mathe- 
matical results. Cauchy, and especially Dirichlet, in the years between 
1820 and 1830, provided a solid basis for Fourier’s somewhat heuristic 
and incomplete reasoning, making the subject as accessible as it 
is important. 


¹ See the translation: The Analytical Theory of Heat, by Joseph Fourier, republished,
Dover Publications, 1955.

In spite of the “arbitrariness” of the functions expressible by trigonometrical
series they are inherently subjected to the condition of periodicity
with the period 2π, since each term of the series has this period.
But, as we shall see, this restriction is inessential as soon as we consider
a function merely in a finite interval from which we can easily extend
it as a periodic function.

This chapter provides an elementary introduction to the theory of 
Fourier series, leaving aside more advanced refinements. 

After some preliminary discussion of periodic functions we shall 
prove the main theorem establishing the validity of the trigonometric 
expansion for a wide class of functions. 

In the subsequent sections we shall discuss somewhat more advanced 
supplementary topics such as uniform and absolute convergence of the 
Fourier series and polynomial approximation of arbitrary continuous 
functions. In the Appendix we shall discuss the theory of Bernoulli’s 
polynomials and their applications. 


8.1. Periodic Functions 


a. General Remarks. Periodic Extension 
of a Function 


The functions sin nx and cos nx are periodic functions of x with the
common period 2π; thus any finite or convergent infinite sum of the
type (1) is also periodic with period 2π. We now make some general
observations concerning periodic functions, amplifying those of Chapter 4,
p. 336.

Periodicity of a function f(x) with the period T is expressed by the 
equation 


(2a)  $f(x + T) = f(x)$,


valid for all values of x. Having the period T implies that f(x) also has
the periods ±T, ±2T, ..., ±mT, ..., and


(2b)  $f(x + mT) = f(x)$


for all integers m.


¹ In representing periodic functions it is often convenient to think of the independent
variable x as a point on the circumference of a circle instead of on a straight line.
For a function f(x) with the period 2π, we consider the angle x at the center of a
circle of unit radius, included between an arbitrary initial radius and the radius to a
variable point on the circumference; then the periodicity of f(x) means that to each
point on the circumference there corresponds just one value of the function, although
the angle x itself is determined only within multiples of 2π.

In special cases f(x) may also happen to have a shorter period. For
example, the function sin(4πx/T) has the period T as well as the smaller
period T/2.

As we saw already in Chapter 4, p. 337, a function f(x) defined in a
closed interval a ≤ x ≤ b can be extended as a periodic function with
period T = b − a for all values of x by defining the function in successive
adjacent intervals of length T outside the original interval
a ≤ x ≤ b by the periodicity relation


(2c)  $f(x + nT) = f(x)$,  n = ±1, ±2, ....


The extended function is neither defined uniquely nor necessarily
continuously at the end points x = a + nT = b + (n − 1)T of our
intervals of length T. We must admit functions f(x) with jump discontinuities
at points x = ξ, which are continuous on either side of ξ
but not necessarily defined or continuous at the point ξ itself.

Then the following notations and definition of f(ξ) will be useful
throughout this chapter: we denote the right-hand limit and the left-hand
limit of f(x) at x = ξ by


(3a)  $f(\xi + 0) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} f(\xi + \varepsilon)$,
(3b)  $f(\xi - 0) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} f(\xi - \varepsilon)$;


it is convenient to assign by definition, to f as its value at the point of
discontinuity ξ itself the mean value


(4)  $f(\xi) = \tfrac{1}{2}\,[\,f(\xi + 0) + f(\xi - 0)\,]$,

disregarding whatever value f(ξ) may have had originally.

With this convention there is no restriction on extending our
original function from a closed interval a ≤ x ≤ b periodically to all
values of x even in cases where f(a) ≠ f(b). We need pay attention only
to the values of f(x) at the jump discontinuities, arising in particular if
the originally defined values of f(a) and f(b) do not coincide; to define
the periodic extension we have to use the mean value ½[f(a) + f(b)] in
place of the values f(a) and f(b).
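
The mean-value convention can be made concrete in a few lines of Python. This is only a minimal sketch and not part of the original text: the helper name `periodic_extension` and the tolerance used to recognize an end point are assumptions chosen for illustration.

```python
import math

def periodic_extension(f, a, b):
    """Extend f from [a, b] to all x with period T = b - a.

    At the end points a + n*T the mean value (f(a) + f(b)) / 2 is used,
    following the convention of Eq. (4). (Hypothetical helper.)
    """
    T = b - a

    def g(x):
        r = (x - a) % T  # offset of x inside its period, 0 <= r < T
        at_end_point = (math.isclose(r, 0.0, abs_tol=1e-12)
                        or math.isclose(r, T, abs_tol=1e-12))
        if at_end_point:
            return 0.5 * (f(a) + f(b))
        return f(a + r)

    return g

# Example: f(x) = x on [-pi, pi], where f(-pi) and f(pi) differ.
g = periodic_extension(lambda x: x, -math.pi, math.pi)
```

With this choice `g` agrees with x inside the interval, repeats with period 2π, and takes the mean value 0 at the jump points x = ±π, ±3π, ....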


b. Integrals Over a Period 


The graph of a periodic function f(x) clearly has the same shape in
any two consecutive intervals corresponding to a period. This implies
the important fact that for a periodic function f(x) of period T and for
arbitrary a


(5)  $\int_{-a}^{T-a} f(x)\,dx = \int_{0}^{T} f(x)\,dx$,

or in words: the integral of a periodic function over a period interval
of length T always has the same value no matter where the interval
lies.

To prove this fact we need only notice that by virtue of the equation
f(ξ − T) = f(ξ) the substitution x = ξ − T yields, for any α, β,


$\int_{\alpha}^{\beta} f(x)\,dx = \int_{\alpha+T}^{\beta+T} f(\xi - T)\,d\xi = \int_{\alpha+T}^{\beta+T} f(x)\,dx.$


Figure 8.1 To illustrate the integral over a whole period.

In particular, for α = −a and β = 0

$\int_{-a}^{0} f(x)\,dx = \int_{T-a}^{T} f(x)\,dx$

and hence

$\int_{-a}^{T-a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_{0}^{T-a} f(x)\,dx$

$\qquad = \int_{T-a}^{T} f(x)\,dx + \int_{0}^{T-a} f(x)\,dx$

$\qquad = \int_{0}^{T} f(x)\,dx,$

as stated. Recalling the geometrical meaning of the integral, the statement
is made obvious by Fig. 8.1.
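
Relation (5) is easy to check numerically. The sketch below is an illustration only: the sample function `f`, its period `T = 2`, and the midpoint-rule helper are assumptions, not taken from the text.

```python
import math

def midpoint_integral(f, lo, hi, n=20000):
    """Approximate the integral of f over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

T = 2.0  # period of the sample function (assumed for illustration)

def f(x):
    # A sample function with period T.
    return math.sin(2 * math.pi * x / T) ** 2 + 0.3 * math.cos(4 * math.pi * x / T)

# Integral over [0, T] and over shifted period intervals [-a, T - a].
reference = midpoint_integral(f, 0.0, T)
shifted = [midpoint_integral(f, -a, T - a) for a in (0.37, 1.0, 5.25)]
```

Every entry of `shifted` should agree with `reference` up to rounding, whatever shift a is chosen.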
c. Harmonic Vibrations 


The simplest periodic functions from which we shall construct the
most general ones are the functions a sin ωx and a cos ωx, or more
generally a sin ω(x − ξ) and a cos ω(x − ξ), where a (> 0), ω (> 0),
and ξ are constants. These functions represent “sinusoidal vibrations”
or simple harmonic vibrations (or oscillations).¹ The period of vibration
is T = 2π/ω. The number ω is called the circular or angular
frequency of the vibrations²; since 1/T = ω/2π is the number of
vibrations in unit time, or the frequency, ω is the number of vibrations
in the time 2π. The number a is called the amplitude of the vibration;
it represents the maximum value of the function a sin ω(x − ξ) or
a cos ω(x − ξ), since both sine and cosine have the maximum value 1.
The number ω(x − ξ) is called the phase and the number ωξ is called
the phase displacement or phase shift.

Figure 8.2 Sinusoidal vibrations.

We obtain the functions a sin ω(x − ξ) graphically by stretching
the sine curve in the ratios 1 : ω along the x-axis and a : 1 along the
y-axis, and then translating the curve a distance ξ in the positive
direction along the x-axis (cf. Fig. 8.2).

¹ Either of these formulas taken alone (for all values of a and ξ) represents the set of
all sinusoidal vibrations; the two formulas are equivalent, since
$a \sin \omega(x - \xi) = a \cos \omega\left[x - \left(\xi + \frac{\pi}{2\omega}\right)\right]$.

² Notice that we distinguish between the frequency and the circular frequency.

By the addition formulas for the trigonometric functions we can also
express harmonic vibrations by α cos ωx + β sin ωx and respectively
β cos ωx − α sin ωx, where α = −a sin ωξ and β = a cos ωξ. Conversely,
every function of the form α cos ωx + β sin ωx represents a
sinusoidal vibration a sin ω(x − ξ) with the amplitude $a = \sqrt{\alpha^2 + \beta^2}$
and the phase displacement ωξ given by the equations α = −a sin ωξ,
β = a cos ωξ. Using the expression α cos ωx + β sin ωx we immediately
can write the sum of two or more such functions with the same circular
frequency ω as another vibration with circular frequency ω.
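
The passage from the pair (α, β) to amplitude and phase displacement can be sketched in Python. The function name `amplitude_and_shift` is hypothetical; the returned `shift` plays the role of ωξ in the text, and `math.atan2` is used here as one convenient way to solve α = −a sin ωξ, β = a cos ωξ.

```python
import math

def amplitude_and_shift(alpha, beta):
    """Return (a, shift) such that
    alpha*cos(t) + beta*sin(t) == a*sin(t - shift) for all t,
    where t stands for omega*x and shift for omega*xi."""
    a = math.hypot(alpha, beta)        # a = sqrt(alpha^2 + beta^2)
    shift = math.atan2(-alpha, beta)   # sin(shift) = -alpha/a, cos(shift) = beta/a
    return a, shift

# Two vibrations with the same circular frequency add coefficient-wise,
# so their sum is again a single sinusoidal vibration.
a_sum, shift_sum = amplitude_and_shift(3.0 + 1.0, 4.0 - 1.0)
```

For example, α = 3 and β = 4 give the amplitude a = 5.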

As seen earlier, periodic functions arise when we wish to represent 
closed curves parametrically. Naturally, they can be used to represent 
phenomena induced by circular motion, say a process repeated peri- 
odically in tune with a flywheel; moreover they are associated with all 
phenomena of vibration. 


8.2 Superposition of Harmonic Vibrations 
a. Harmonics. Trigonometric Polynomials 


Although many vibrations are purely sinusoidal (cf. p. 405), most 
periodic motions have a more complicated character, being obtained 
by “superposition” of several sinusoidal vibrations. Mathematically, 
the motion of a point on a line with the coordinate x as a function of 
the time may be given by a function that is the sum of a number of pure 
periodic functions of the above type. The harmonic components of the 
function are then superimposed (that is, their ordinates are added). 
In this superposition we assume that the circular frequencies (and, of 
course, the periods) of the superposed vibrations are all different, for 
the superposition of two sinusoidal vibrations with the same circular 
frequency yields another sinusoidal vibration with the same circular 
frequency as shown above. 

For the superposition of two sinusoidal vibrations with different
circular frequencies ω₁ and ω₂, there are two fundamentally distinct
possibilities, depending on whether ω₁/ω₂ is rational or not, or, as we
say, whether the frequencies are commensurable or incommensurable.

As an example of the first case we assume that the second circular
frequency is twice that of the first: ω₂ = 2ω₁. The period of the second
vibration is then half that of the first, 2π/2ω₁ = T₂ = T₁/2, and so it
has not only the period T₂ but also the doubled period T₁, since the
function repeats itself after this double period; the function formed by
superposition must likewise have the period T₁. The second vibration,
with twice the circular frequency and half the period of the first, is
called a first harmonic of the first vibration (the fundamental).

Corresponding statements are true if we introduce another vibration
with circular frequency ω₃ = 3ω₁. Here again the function sin 3ω₁x
necessarily repeats itself with the period 2π/ω₁ = T₁. Such a vibration
is called a second harmonic of the given vibration. Similarly, we can
consider third, fourth, ..., (n − 1)th harmonics with the circular
frequencies ω₄ = 4ω₁, ω₅ = 5ω₁, ..., ωₙ = nω₁, and, moreover, with
any phase displacements we wish. Every such harmonic necessarily
repeats itself after the period T₁ = 2π/ω₁, and consequently every
function obtained by superposing a number of vibrations, each of which
is a harmonic of a given fundamental circular frequency ω₁, is itself a
periodic function with the period 2π/ω₁ = T₁. By superposing vibrations
with circular frequencies ranging from that of the fundamental to
that of the (n − 1)th harmonic we obtain a periodic function in the
form of a trigonometric polynomial


(6)  $S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu\omega x + b_\nu \sin \nu\omega x)$.


(The constant a₀/2, which does not affect the periodicity, is affixed for
later convenience.) Since this function contains 2n + 1 arbitrary constants
$a_\nu$, $b_\nu$, we are able to generate curves which may not at all
resemble the original sine curves. Figures 8.3 to 8.5 are graphical
illustrations.
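
A trigonometric polynomial of the form (6) is straightforward to evaluate. The following Python sketch is illustrative only; the function name `trig_polynomial`, the list layout of the coefficients (with `b[0]` unused), and the sample values are assumptions.

```python
import math

def trig_polynomial(a, b, omega, x):
    """Evaluate S_n(x) = a[0]/2 + sum_{v=1}^{n} (a[v] cos(v*omega*x) + b[v] sin(v*omega*x)).

    a and b are lists of length n + 1; b[0] is not used.
    """
    n = len(a) - 1
    total = a[0] / 2
    for v in range(1, n + 1):
        total += a[v] * math.cos(v * omega * x) + b[v] * math.sin(v * omega * x)
    return total

# Sample coefficients (chosen arbitrarily) for n = 3.
a = [1.0, 0.5, 0.0, -0.25]
b = [0.0, 1.0, 0.3, 0.0]
omega = 3.0
T1 = 2 * math.pi / omega  # period of the fundamental
```

Whatever coefficients are chosen, the values at x and at x + T1 coincide, in agreement with the periodicity statement above.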

The term “harmonic” alludes to acoustics,¹ where a fundamental
vibration with circular frequency ω corresponds to a tone of a certain
pitch, and the first, second, third, etc., harmonics correspond to the
sequence of harmonics of the fundamental, that is, to the octave, octave plus
fifth, double octave, etc.

In general, for the superposition of vibrations in which the circular 
frequencies have rational ratios, these circular frequencies can all 
be represented as integral multiples of a common fundamental 
frequency. 

The superposition of two vibrations having incommensurable circular
frequencies ω₁ and ω₂, however, represents a different phenomenon.
Here the superposition of sinusoidal vibrations is no longer periodic.
Without going into a detailed discussion, we remark that such functions
have an “approximately periodic” character or, as we say, are almost
periodic.


*b. Beats 


¹ In acoustics the term overtone is also used.

A final remark on the superposition of sinusoidal vibrations concerns
the phenomenon of so-called beats. If we superpose two vibrations,
each of unit amplitude but having different circular frequencies
ω₁ and ω₂, and if for the sake of simplicity we take the same value of ξ
(see p. 575) for both (the generalization to arbitrary phase is left to the
reader), then we are concerned with the function

$y = \sin \omega_1 x + \sin \omega_2 x \qquad (\omega_1 > \omega_2 > 0).$

By a well-known trigonometrical formula we have

$y = 2 \cos[\tfrac{1}{2}(\omega_1 - \omega_2)x]\, \sin[\tfrac{1}{2}(\omega_1 + \omega_2)x].$

This equation represents a phenomenon which we describe as follows:
we have a vibration with the circular frequency ½(ω₁ + ω₂) and the
period 4π/(ω₁ + ω₂). This vibration does not have a constant amplitude
but a varying “amplitude” given by the expression

$2 \cos[\tfrac{1}{2}(\omega_1 - \omega_2)x]$

which varies with a longer period 4π/(ω₁ − ω₂). This description is
particularly useful when the two circular frequencies ω₁ and ω₂ are
relatively large, whereas their difference (ω₁ − ω₂) is comparatively
small. Then the amplitude 2 cos[½(ω₁ − ω₂)x] of the vibration with
period 4π/(ω₁ + ω₂) varies only slowly compared with the period of
vibration, and this change of amplitude repeats itself periodically with
the long period 4π/(ω₁ − ω₂). These rhythmic changes of amplitude
are called beats. Everyone is acquainted with this phenomenon in
acoustics and electronics. In radio transmission the circular frequencies
ω₁ and ω₂ are, as a rule, far above those which the ear can detect,
whereas the difference ω₁ − ω₂ falls in the range of audible notes. The
beats then cause an audible tone, whereas the original vibrations remain
imperceptible to the ear.

Figure 8.6 Beats.

An example of beats is illustrated graphically in Fig. 8.6.
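
The product form of the superposition can be checked numerically. The Python sketch below is illustrative; the function names and the sample frequencies 21 and 19 are assumptions chosen so that the difference is small compared with the frequencies themselves.

```python
import math

def beat_sum(w1, w2, x):
    """Superposition sin(w1 x) + sin(w2 x) of two unit-amplitude vibrations."""
    return math.sin(w1 * x) + math.sin(w2 * x)

def envelope(w1, w2, x):
    """Slowly varying 'amplitude' 2 cos[(w1 - w2) x / 2]."""
    return 2 * math.cos(0.5 * (w1 - w2) * x)

def beat_product(w1, w2, x):
    """Product form: envelope times the fast vibration sin[(w1 + w2) x / 2]."""
    return envelope(w1, w2, x) * math.sin(0.5 * (w1 + w2) * x)

w1, w2 = 21.0, 19.0  # sample circular frequencies (assumed)
beat_period = 4 * math.pi / (w1 - w2)  # long period of the amplitude
```

Both forms agree for every x, and the absolute value of the sum never exceeds that of the envelope.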
8.3 Complex Notation
a. General Remarks 


Operation with trigonometric functions is often simplified by using
complex numbers according to Euler’s relation

$\cos\theta + i \sin\theta = e^{i\theta}$

or

(7a)  $\cos\theta = \tfrac{1}{2}(e^{i\theta} + e^{-i\theta})$,

(7b)  $\sin\theta = \frac{1}{2i}(e^{i\theta} - e^{-i\theta})$.

(Compare Chapter 7, p. 551.) Accordingly, we can express sinusoidal
vibrations in terms of the complex quantities $e^{i\omega x}$, $e^{-i\omega x}$, or $a e^{i\omega(x-\xi)}$,
$a e^{-i\omega(x-\xi)}$ respectively, where a, ω, and ωξ are amplitude, circular
frequency, and phase displacement. Ultimately, of course, real vibrations
are obtained from the complex expression simply by separating
real and imaginary parts.

One of the conveniences of the complex notation is the fact that the
derivatives with respect to the time x are obtained by differentiating the
complex exponential function as if i were a real constant; the formula

$\frac{d}{dx}\, a[\cos \omega(x - \xi) + i \sin \omega(x - \xi)] = a\omega[-\sin \omega(x - \xi) + i \cos \omega(x - \xi)]$

$\qquad = i a \omega[\cos \omega(x - \xi) + i \sin \omega(x - \xi)]$

that follows from the formulas for the derivatives of the sine and
cosine functions can be written in the concise form

(8)  $\frac{d}{dx}\, a e^{i\omega(x-\xi)} = i a \omega\, e^{i\omega(x-\xi)}$.


The integral of a complex-valued function y(x), say y(x) = p(x) + iq(x),
is naturally defined by

$\int y(x)\,dx = \int p(x)\,dx + i \int q(x)\,dx.$

Accordingly, for n ≠ 0

$\int e^{inx}\,dx = \int \cos nx\,dx + i \int \sin nx\,dx = \frac{1}{n} \sin nx - \frac{i}{n} \cos nx = \frac{1}{in}\, e^{inx}.$

In particular, for any integer n we have

$\int_{-\pi}^{\pi} e^{inx}\,dx = \begin{cases} 0 & \text{for } n \neq 0 \\ 2\pi & \text{for } n = 0. \end{cases}$

More generally, if we remember that $e^{inx} e^{-imx} = e^{i(n-m)x}$, we have for
any integers m, n

(9)  $\int_{-\pi}^{\pi} e^{inx} e^{-imx}\,dx = \begin{cases} 0 & \text{for } n \neq m \\ 2\pi & \text{for } n = m. \end{cases}$

These relations are merely concise expressions of the orthogonality
relations between trigonometric functions (see p. 274).
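
The orthogonality relations (9) can be verified numerically with Python's complex arithmetic. This is a sketch under stated assumptions: the function name `inner` and the use of a midpoint sum with 4096 points are illustrative choices, not part of the text.

```python
import cmath
import math

def inner(n, m, steps=4096):
    """Midpoint-rule approximation of the integral of e^{inx} e^{-imx} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += cmath.exp(1j * n * x) * cmath.exp(-1j * m * x)
    return h * total

same = inner(3, 3)       # expected to be close to 2*pi
different = inner(3, 5)  # expected to be close to 0
```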


* b. Application to Alternating Currents 


We insert an illustration of these ideas by an important example, denoting
the independent variable, the time, by t instead of x.

We consider an electric circuit with resistance R and inductance L, on
which an external electromotive force (voltage) E is impressed. In direct
current, the voltage E is constant, and the current I is given by Ohm’s law,
E = RI. For an alternating current, however, E, and consequently I, is a
function of the time t, and Ohm’s law takes the generalized form (cf. p. 635)

(10)  $E - L \frac{dI}{dt} = RI.$

We consider the external electromotive forces E which are sinusoidal with
circular frequency ω, given by ε cos ωt or ε sin ωt, and combine both
possibilities formally in the complex form

$E = \varepsilon e^{i\omega t} = \varepsilon \cos \omega t + i \varepsilon \sin \omega t,$

where ε represents the amplitude. Often it is useful to admit complex values
also for the amplitude

$\varepsilon = |\varepsilon|\, e^{-i\eta};$

then

$E = |\varepsilon|\, e^{i(\omega t - \eta)} = |\varepsilon| \{\cos(\omega t - \eta) + i \sin(\omega t - \eta)\}.$

We may operate with this “complex voltage” E and the corresponding
complex current I as if i were a real parameter. Then the significance of the
complex relation between the complex quantities E and I is that the current
corresponding to an electromotive force ε cos ωt is the real part of I, whereas
the current corresponding to an electromotive force ε sin ωt is the imaginary
part of I. The complex current is given by an expression of the form

$I = \alpha e^{i\omega t} = \alpha(\cos \omega t + i \sin \omega t)$

which is also sinusoidal with circular frequency ω. The derivative of I is
then given formally by

$\frac{dI}{dt} = i \alpha \omega\, e^{i\omega t} = \alpha\omega(-\sin \omega t + i \cos \omega t) = i \omega I.$

Substituting these quantities in the generalized form of Ohm’s law (Eq. 10)
and dividing by the factor $e^{i\omega t}$, we obtain the equation

$\varepsilon - \alpha L i \omega = R\alpha,$

or

$\alpha = \frac{\varepsilon}{R + i\omega L},$

as well as

$E = (R + i\omega L) I = WI.$

We may regard this last equation as Ohm’s law for alternating currents in
complex form if we call the quantity

$W = R + i\omega L$

the complex resistance of the circuit. Ohm’s law is then the same as for
direct current: the current is equal to the voltage divided by the resistance.
Writing the complex resistance W with w = |W| in the form

$W = w e^{i\delta} = w \cos \delta + i w \sin \delta,$

where

$|W| = w = \sqrt{R^2 + L^2\omega^2}, \qquad \tan \delta = \frac{\omega L}{R},$

we obtain

$I = \frac{E}{W} = \frac{\varepsilon}{w}\, e^{i(\omega t - \delta)}.$

According to this formula the current has the same period (and circular
frequency) as the voltage; the amplitude |α| of the current is related to the
amplitude ε of the electromotive force for real ε by

$|\alpha| = \frac{\varepsilon}{w},$

and, in addition, there is a difference of phase between the current and
the voltage. The current reaches its maximum, not at the same time as
the voltage but at a time δ/ω later, and the same is, of course, true for the
minimum. In electrical engineering the quantity $w = \sqrt{R^2 + L^2\omega^2}$ is
frequently called the impedance or alternating current resistance of the circuit
for the circular frequency ω; the phase displacement, usually stated in
degrees, is sometimes called the lag.

If the amplitude ε is complex in the form

$\varepsilon = |\varepsilon|\, e^{-i\eta},$

then nothing essential is changed in the form of Ohm’s law, except that η is an
additional phase shift and we have

$E = |\varepsilon|\, e^{i(\omega t - \eta)},$

$I = \frac{E}{W} = \frac{|\varepsilon|}{|W|}\, e^{i\omega t - i(\delta + \eta)}.$
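
The complex form of Ohm’s law lends itself to a short computation. The Python sketch below is illustrative only; the function name `ac_current` and the sample values R = 3, L = 2, ω = 2, ε = 10 are assumptions chosen so that W = 3 + 4i.

```python
import cmath
import math

def ac_current(eps, R, L, omega):
    """Return (alpha, w, delta) for the circuit with complex resistance W = R + i*omega*L:
    alpha is the complex current amplitude eps / W, w = |W| the impedance,
    and delta the phase lag with tan(delta) = omega*L / R."""
    W = complex(R, omega * L)
    alpha = eps / W
    return alpha, abs(W), cmath.phase(W)

alpha, w, delta = ac_current(eps=10.0, R=3.0, L=2.0, omega=2.0)

def current(t, omega=2.0):
    """Real current produced by the voltage eps*cos(omega*t): the real part of I."""
    return (alpha * cmath.exp(1j * omega * t)).real
```

With these sample values the impedance is w = 5, the current amplitude is ε/w = 2, and the current equals 2 cos(ωt − δ).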

c. Complex Notation for Trigonometrical Polynomials 


A compound vibration of the type

(11)  $S(x) = \tfrac{1}{2} a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$

(for brevity we have taken ω = 1) can be reduced to complex form by
substituting

$\cos \nu x = \tfrac{1}{2}(e^{i\nu x} + e^{-i\nu x})$ and $\sin \nu x = -\tfrac{1}{2} i (e^{i\nu x} - e^{-i\nu x}).$

This expression then assumes the simpler form

(12)  $S(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x},$

where the complex numbers $\alpha_\nu$ are related to the real numbers $a_0$,
$a_\nu$, and $b_\nu$ by the equations

(13a)  $\alpha_\nu = \tfrac{1}{2}(a_\nu - i b_\nu)$, $\alpha_{-\nu} = \tfrac{1}{2}(a_\nu + i b_\nu)$ for ν = 1, 2, ..., n, and $\alpha_0 = \tfrac{1}{2} a_0$.

Solving these relations for the $a_\nu$ and $b_\nu$, we find that

(13b)  $a_\nu = \alpha_\nu + \alpha_{-\nu}$, $b_\nu = i(\alpha_\nu - \alpha_{-\nu})$.

(The case ν = 0 is included.)
Conversely, we may regard any arbitrary expression of the form

$\sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}$

as a function representing the superposition of vibrations written in
complex form. The result of this superposition is real if and only if
$\alpha_\nu + \alpha_{-\nu}$ is real and $\alpha_\nu - \alpha_{-\nu}$ is pure imaginary; that is, if $\alpha_\nu$ and $\alpha_{-\nu}$ are
conjugate complex numbers.
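
Relations (13a) translate directly into code. In the Python sketch below the helper names (`to_complex`, `real_form`, `complex_form`), the dictionary keyed by ν, and the sample coefficients are all assumptions made for illustration.

```python
import cmath
import math

def to_complex(a, b):
    """Map real coefficients a[0..n], b[0..n] (b[0] unused) to the complex
    coefficients alpha_v, v = -n..n, following (13a)."""
    n = len(a) - 1
    alpha = {0: complex(a[0] / 2)}
    for v in range(1, n + 1):
        alpha[v] = 0.5 * complex(a[v], -b[v])
        alpha[-v] = 0.5 * complex(a[v], b[v])
    return alpha

def real_form(a, b, x):
    """Evaluate the real form (11)."""
    return a[0] / 2 + sum(a[v] * math.cos(v * x) + b[v] * math.sin(v * x)
                          for v in range(1, len(a)))

def complex_form(alpha, x):
    """Evaluate the complex form (12)."""
    return sum(c * cmath.exp(1j * v * x) for v, c in alpha.items())

a = [1.0, 0.5, -0.25]   # sample coefficients (assumed)
b = [0.0, 2.0, 0.75]
alpha = to_complex(a, b)
```

Because each pair $\alpha_\nu$, $\alpha_{-\nu}$ produced this way is conjugate, the complex form has a vanishing imaginary part and reproduces the real form.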
d. A Trigonometric Formula


As an application of the complex notation we prove the following
identity:

(14)  $\sigma_n(\alpha) = \tfrac{1}{2} + \cos \alpha + \cos 2\alpha + \cdots + \cos n\alpha = \frac{\sin(n + \frac{1}{2})\alpha}{2 \sin \frac{1}{2}\alpha},$

which is needed later on in this chapter. The formula makes sense only
when sin ½α ≠ 0, that is, when α does not have one of the values
0, ±2π, ±4π, .... However, once the formula has been established
for sin ½α ≠ 0, we conclude that the expression [sin(n + ½)α]/(2 sin ½α)
is a continuous function of α for all α, if we define its value at the
exceptional points as that of $\sigma_n(\alpha)$, that is, n + ½.

For the proof we replace the cosine function by its exponential
expression [see formula (13a) with $a_\nu = 1$, $b_\nu = 0$]:

$\sigma_n(\alpha) = \tfrac{1}{2} \sum_{\nu=-n}^{n} e^{i\nu\alpha}.$

On the right we have a geometric progression with the common ratio
$q = e^{i\alpha} = \cos \alpha + i \sin \alpha$. Hence q can have the value 1 only if
cos α = 1, sin α = 0, that is, if α has one of the exceptional values
0, ±2π, ±4π, .... For all other values α the ordinary formula for
the sum yields

$\sigma_n(\alpha) = \tfrac{1}{2}\, q^{-n}\, \frac{1 - q^{2n+1}}{1 - q} = \frac{1}{2}\, \frac{e^{-in\alpha} - e^{(n+1)i\alpha}}{1 - e^{i\alpha}}.$

On multiplying the numerator and denominator by $e^{-i\alpha/2}$ we obtain,
as stated,

$\sigma_n(\alpha) = \frac{\sin(n + \frac{1}{2})\alpha}{2 \sin \frac{1}{2}\alpha}.$

Integrating $\sigma_n(t)$ on 0 ≤ t ≤ π, we find the useful result that
independently of n

(15)  $\int_0^{\pi} \frac{\sin(n + \frac{1}{2})t}{2 \sin \frac{1}{2}t}\,dt = \int_0^{\pi} \left( \frac{1}{2} + \sum_{\nu=1}^{n} \cos \nu t \right) dt = \tfrac{1}{2}\pi,$

since the integral of each cosine term of the sum vanishes.
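
Identity (14) and the integral (15) can both be checked numerically. The following Python sketch is illustrative; the function names and the midpoint rule with 2000 subintervals are assumptions.

```python
import math

def sigma_sum(n, alpha):
    """Left side of (14): 1/2 + cos(alpha) + ... + cos(n*alpha)."""
    return 0.5 + sum(math.cos(v * alpha) for v in range(1, n + 1))

def sigma_closed(n, alpha):
    """Right side of (14); requires sin(alpha/2) != 0."""
    return math.sin((n + 0.5) * alpha) / (2 * math.sin(0.5 * alpha))

def integral_0_pi(n, steps=2000):
    """Midpoint-rule value of the integral (15) of the closed form over [0, pi].
    The midpoints never hit the exceptional value t = 0."""
    h = math.pi / steps
    return h * sum(sigma_closed(n, (k + 0.5) * h) for k in range(steps))
```

For every n tried, `integral_0_pi(n)` should come out as π/2 up to rounding, independently of n.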
8.4 Fourier Series


a. Fourier Coefficients 


Trigonometrical polynomials

(16)  $f(x) = S_n(x) = \tfrac{1}{2} a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$

of order n depend on the 2n + 1 coefficients $a_\nu$ and $b_\nu$. It is remarkable
that these “Fourier coefficients” can be expressed simply by the following
formulas in terms of the values f(x) of the sum:

(17)  $a_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos \mu x\,dx, \qquad b_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin \mu x\,dx.$

The proof follows if we multiply (16) by cos μx or sin μx and then
integrate. The orthogonality relations (see p. 274) yield the expressions
immediately, since only the terms with ν = μ make a nonvanishing
contribution.
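
Formulas (17) can be tried out on a trigonometric polynomial whose coefficients are known in advance. The Python sketch below is illustrative: the function name `fourier_coefficients`, the midpoint sum, and the sample polynomial are assumptions.

```python
import math

def fourier_coefficients(f, n, steps=4096):
    """Approximate a_mu, b_mu (mu = 0..n) of formula (17) by a midpoint sum over [-pi, pi]."""
    h = 2 * math.pi / steps
    xs = [-math.pi + (k + 0.5) * h for k in range(steps)]
    a = [(h / math.pi) * sum(f(x) * math.cos(mu * x) for x in xs) for mu in range(n + 1)]
    b = [(h / math.pi) * sum(f(x) * math.sin(mu * x) for x in xs) for mu in range(n + 1)]
    return a, b

def f(x):
    # Sample polynomial with a_0 = 3, a_1 = 2, b_3 = -0.5 and all other coefficients zero.
    return 1.5 + 2.0 * math.cos(x) - 0.5 * math.sin(3 * x)

a, b = fourier_coefficients(f, 4)
```

The computed lists `a` and `b` should reproduce the coefficients built into `f`, including the value a₀ = 3 for the constant term ½a₀ = 1.5.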

In the complex terminology

(16a)  $f(x) = S_n(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}, \qquad a_\nu = \alpha_\nu + \alpha_{-\nu}, \quad b_\nu = i(\alpha_\nu - \alpha_{-\nu}),$

the corresponding expressions for the complex Fourier coefficients are

(17a)  $\alpha_\nu = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-i\nu x}\,dx$

as is seen also on the basis of the complex orthogonality relations
(9), p. 583.

Incidentally, the factor ½ in the notation for the constant term ½a₀
of (16) serves merely to make the formula (17) valid for μ = 0.

Now we are led to the main theorem on Fourier series by the natural
question of whether, by letting the degree n of the Fourier polynomial
(16) tend to infinity, it becomes possible to represent functions f(x) which
are periodic with the period 2π but otherwise essentially arbitrary.

Our main result in the next articles will indeed be: Any periodic
function f(x) which is sectionally continuous and has sectionally continuous
derivatives of first and second order can be represented by an
infinite “Fourier series”

$f(x) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$

or in complex notation

$f(x) = \sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x}$

with coefficients given by (17) and (17a).


b. Basic Lemma 


We first recall the definition of a piecewise or sectionally continuous 
function in an interval, as a function which is continuous except for a 
finite number of jump discontinuities in the interval. 

We further recall that the value of a periodic function f(x) is defined 
at a point of discontinuity as the mean of the limiting values from the 
two sides as agreed earlier [Eq. (4), p. 573]. 

A function f(x) is sectionally continuous and has sectionally continuous
first and second derivatives, if we can divide the whole interval
into a finite number of subintervals, such that f, f′, f″ are continuous
in each open subinterval and approach definite limits at the end points.

The key to the proof of the main theorem will be a simple fact.


Lemma. If a function k(x) and its first derivative k′(x) are sectionally
continuous in the interval a ≤ x ≤ b, then the integral

$K_\lambda = \int_a^b k(x) \sin \lambda x\,dx$

tends to zero as λ → ∞.

PROOF. To prove this lemma we use integration by parts. Supposing
that k and k′ are continuous on a ≤ x ≤ b, we have

(18)  $K_\lambda = \int_a^b k(x) \sin \lambda x\,dx = \frac{1}{\lambda} \left[ k(a) \cos \lambda a - k(b) \cos \lambda b + \int_a^b k'(x) \cos \lambda x\,dx \right];$

as λ increases, the right-hand side obviously tends to zero, since the
bracket stays bounded independently of λ. If k(x)
or k′(x) have jump discontinuities ξ in the interval, then we subdivide it
into parts by these points ξ, apply our argument to the parts, and add
the results.
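
The decay of $K_\lambda$ can be observed numerically. In the Python sketch below the sample function k(x) = x² on 0 ≤ x ≤ 1, the helper name `K`, and the midpoint rule are assumptions; for this k, formula (18) gives the bound |K_λ| ≤ (|k(0)| + |k(1)| + ∫|k′| dx)/λ = 2/λ.

```python
import math

def K(k, a, b, lam, n=20000):
    """Midpoint-rule approximation of the integral of k(x) sin(lam*x) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * h
        total += k(x) * math.sin(lam * x)
    return h * total

def k(x):
    return x * x  # sample function (assumed); k(0) = 0, k(1) = 1, integral of |k'| is 1

# K_lambda for increasing lambda, to be compared with the bound 2 / lambda.
values = {lam: K(k, 0.0, 1.0, lam) for lam in (10.0, 100.0, 1000.0)}
```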


Omitting the proof, we state that the lemma actually remains true without
any assumption about existence of the derivative k′(x), merely using the
sectional continuity of k. The proof under these milder conditions relies on
the fact that for λ ≠ 0 the function sin λx is alternately positive and negative
in successive intervals of length π/λ. For large values of λ the contributions
to the integral from adjacent intervals almost cancel one another because of
the continuity of k(x).
c. Proof of $\int_0^{\infty} \frac{\sin z}{z}\,dz = \frac{\pi}{2}$


As an application of the lemma we evaluate the integral

(19)  $I = \int_0^{\infty} \frac{\sin z}{z}\,dz.$

This improper integral is defined by the relation

$I = \lim_{M \to \infty} I_M,$

where

$I_M = \int_0^{M} \frac{\sin z}{z}\,dz.$

The convergence of the improper integral I, that is, the existence of
the limit of $I_M$ for M → ∞, had been proved already on p. 310. The
convergence proof was based on integration by parts and may be
restated here. If, say, 0 < M < N, we have

(20)  $|I_N - I_M| = \left| \int_M^N \frac{\sin z}{z}\,dz \right| = \left| \left[ -\frac{\cos z}{z} \right]_M^N - \int_M^N \frac{\cos z}{z^2}\,dz \right|$

$\qquad \le \frac{1}{M} + \frac{1}{N} + \int_M^N \frac{dz}{z^2} = \frac{2}{M}.$

Since then $I_M$ and $I_N$ differ arbitrarily little if both M and N are
sufficiently large, the existence of $I = \lim_{M \to \infty} I_M$ is assured by Cauchy’s
convergence test. Moreover, letting N tend to infinity in (20) we find
an estimate for the rate at which the $I_M$ approach their limit I:

(20a)  $|I - I_M| \le \frac{2}{M}.$


We can rewrite our expression for I in such a way that I appears as
a limit of integrals over a fixed finite interval. Let p be an arbitrary
positive number; for M = λp the substitution z = λx, dz = λ dx
shows that

$I_{\lambda p} = \int_0^{\lambda p} \frac{\sin z}{z}\,dz = \int_0^{p} \frac{\sin \lambda x}{x}\,dx.$

Since λp → ∞ for λ → ∞ and fixed positive p, we clearly have

$I = \lim_{\lambda \to \infty} \int_0^{p} \frac{\sin \lambda x}{x}\,dx$

and more precisely from (20a)

$\left| I - \int_0^{p} \frac{\sin \lambda x}{x}\,dx \right| \le \frac{2}{\lambda p}.$

Thus for any positive p the expressions

$\int_0^{p} \frac{\sin \lambda x}{x}\,dx$

approach for λ → ∞ one and the same value I; moreover, the convergence
is uniform in p as long as we restrict p to values above some
fixed positive number P. Indeed, the difference between the integral
and the limit I is then less than ε for λ > 2/(Pε).

We now apply our lemma of p. 588 to the function

$k(x) = \frac{1}{x} - \frac{1}{2 \sin(x/2)}.$

If we define k(0) = 0, the function k(x) is continuous and has a continuous
first derivative for 0 ≤ x < 2π (see p. 466). Hence our lemma
shows that

$\int_0^{p} \sin \lambda x \left( \frac{1}{x} - \frac{1}{2 \sin(x/2)} \right) dx$

tends to zero for λ → ∞ as long as 0 < p < 2π. Moreover, by (18) the
convergence is uniform for 0 ≤ p ≤ π since |k(x)| and |k′(x)| are
bounded in the interval 0 ≤ x ≤ π. It follows from our previous
result that for any p in the interval 0 < p < 2π

$\lim_{\lambda \to \infty} \int_0^{p} \frac{\sin \lambda x}{2 \sin(x/2)}\,dx = I,$

and also that the convergence is uniform in p for P ≤ p ≤ π, where P
is a fixed positive number.

Now for p = π and λ = n + ½ (where n is an integer) we have
evaluated this integral [see formula (15), p. 586] and found that it has the
value π/2 independently of n. Letting λ tend to infinity through values
of the form λ = n + ½, we find then for I the value π/2:

(21)  $\int_0^{\infty} \frac{\sin z}{z}\,dz = \frac{\pi}{2}.$

We have proved moreover that

(21a)  $\lim_{\lambda \to \infty} \int_0^{p} \frac{\sin \lambda x}{x}\,dx = \frac{\pi}{2},$

where for a fixed positive P the convergence is uniform for P ≤ p,
and that

(21b)  $\lim_{\lambda \to \infty} \int_0^{p} \frac{\sin \lambda x}{2 \sin(x/2)}\,dx = \frac{\pi}{2},$

where the convergence is uniform for P ≤ p ≤ π.
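
The value (21) and the rate estimate (20a) can be illustrated numerically. The Python sketch below is a rough check only; the function name `sine_integral`, the midpoint rule, and the step density are assumptions.

```python
import math

def sine_integral(M, steps_per_unit=200):
    """Midpoint-rule approximation of I_M, the integral of sin(z)/z over [0, M].
    The midpoints are strictly positive, so the quotient is always defined."""
    n = max(1, int(M * steps_per_unit))
    h = M / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * h
        total += math.sin(z) / z
    return h * total

# Distance of I_M from pi/2, to be compared with the bound 2/M of (20a).
errors = {M: abs(math.pi / 2 - sine_integral(M)) for M in (10.0, 50.0, 200.0)}
```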


d. Fourier Expansion for the Function $\phi(x) = x$


Our last result leads directly to the Fourier expansion of two related sectionally linear periodic functions $\phi(x)$, $\psi(x)$ defined in the interval $-\pi < x < \pi$ by

$$\phi(x) = x$$

and

$$\psi(x) = \begin{cases} \pi - x & \text{for } x > 0, \\ 0 & \text{for } x = 0, \\ -\pi - x & \text{for } x < 0. \end{cases} \tag{22}$$

(See Figs. 8.7 and 8.8.)

Figure 8.7 The function $\phi(x)$.

The first function $\phi$, periodically extended outside the interval $-\pi < x < +\pi$, has jump discontinuities at the end points, whereas $\psi(x)$ suffers a jump of $2\pi$ at $x = 0$. Obviously, the two functions periodically extended are related to each other by

$$\psi(x) = \phi(\pi - x).$$

The Fourier expansion for $\psi(x)$ follows immediately for $0 < x \le \pi$ from formulas (14), p. 586, and (21b), p. 591, for $\lambda = n + \frac{1}{2}$ and $p = x$, by passage to the limit, $n \to \infty$. We find the Fourier series

$$\psi(x) = 2\left(\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots\right). \tag{23a}$$

The same holds then also for $-\pi \le x < 0$, since both sides are odd functions of $x$. The series is uniformly convergent for $\epsilon \le |x| \le \pi$, with any arbitrarily small, positive value of $\epsilon$. At $x = 0$ all terms of the series are zero, hence also the sum, in agreement with the definition of $\psi(0)$. Since both sides have period $2\pi$ the identity (23a) holds then for all $x$.

Figure 8.8 The function $\psi(x)$.
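As an illustration of the expansion just obtained, one may compare partial sums of the sine series $2(\sin x + \tfrac{1}{2}\sin 2x + \tfrac{1}{3}\sin 3x + \cdots)$ with the sawtooth function of (22). This is only a sketch; the helper names `psi` and `psi_partial_sum` are ours, and the number of terms is an arbitrary choice.

```python
import math

def psi(x):
    """The function of (22) on -pi < x < pi."""
    if x > 0:
        return math.pi - x
    if x < 0:
        return -math.pi - x
    return 0.0

def psi_partial_sum(x, n):
    """n-th partial sum of 2*(sin x + sin 2x / 2 + ... + sin nx / n)."""
    return 2.0 * sum(math.sin(v * x) / v for v in range(1, n + 1))

for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, psi(x), psi_partial_sum(x, 5000))
```

Agreement is good at points bounded away from the jump at $x = 0$; close to the jump many more terms are needed, in line with the restriction $\epsilon \le |x|$ in the statement on uniform convergence.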

That the coefficients of the expansion are indeed the Fourier 
coefficients defined by formula (17), p. 587, is confirmed easily. 

The Fourier expansion for $\phi(x)$ is now obtained directly from $\phi(x) = \psi(\pi - x)$:

$$\phi(x) = 2\sum_{\nu=1}^{\infty} (-1)^{\nu+1}\,\frac{\sin \nu x}{\nu} = 2\left(\sin x - \tfrac{1}{2}\sin 2x + \tfrac{1}{3}\sin 3x - + \cdots\right). \tag{23b}$$

Here the convergence is uniform as soon as the point $x$ is bounded away from the discontinuity points $x = \pm\pi$ by the condition $|x| \le \pi - \epsilon$. For $x = \pi/2$ we obtain again Leibnitz' series

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots.$$

It should be mentioned that the two series for $\psi$ and $\phi$ do not converge absolutely; indeed the absolute values for $x = \pi/2$ form the divergent series

$$2\sum_{\nu=0}^{\infty} \frac{1}{2\nu+1}.$$

Formula (23b) is remarkable as an example of an infinite series of continuous functions which converges for all $x$ but has as sum a discontinuous function, namely, the piecewise linear function $\phi(x)$. Each partial sum of the series is continuous, since the sum of any finite number of continuous functions must again be continuous. Because a uniformly convergent infinite series of continuous functions has a continuous sum, the Fourier series cannot converge uniformly in a neighborhood of a point $x$ at which $\phi$ is discontinuous, that is, for $x = \pm\pi, \pm 3\pi, \ldots$. Figure 8.4, p. 579 illustrates how the successive partial sums, which are trigonometric polynomials and continuous functions, approximate the sectionally linear function $\tfrac{1}{2}\phi(x)$ uniformly in an interval of continuity, but that near the end point the functions change more and more rapidly.
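The failure of uniform convergence near the jump can be made visible numerically. In the sketch below (the names `phi_partial_sum` and `error_at` and the particular sample points are our own choices) the error of the $n$th partial sum of the series for $\phi(x) = x$ is evaluated once at a fixed interior point and once at the moving point $x = \pi - \pi/n$, which approaches the discontinuity as $n$ grows.

```python
import math

def phi_partial_sum(x, n):
    """n-th partial sum of 2 * sum of (-1)^(v+1) * sin(v x) / v."""
    return 2.0 * sum((-1) ** (v + 1) * math.sin(v * x) / v
                     for v in range(1, n + 1))

def error_at(x, n):
    """|phi(x) - S_n(x)| for a point x with -pi < x < pi, where phi(x) = x."""
    return abs(x - phi_partial_sum(x, n))

for n in (50, 200, 800):
    interior = error_at(1.0, n)                   # fixed point away from the jump
    moving = error_at(math.pi - math.pi / n, n)   # point approaching x = pi
    print(n, interior, moving)
```

The interior error shrinks with $n$, whereas the error at the moving point does not tend to zero, so no single $n$ makes the error small on the whole interval at once.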


e. The Main Theorem on Fourier Expansion 


The Fourier Coefficients. After the preceding preparations, the possibility of expanding a large class of functions can be easily ascertained. The form of such an expansion for a function $f(x)$ with the period $2\pi$ is

$$f(x) = \tfrac{1}{2}a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x), \tag{24a}$$

or in complex notation

$$f(x) = \sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x}. \tag{24b}$$

We first assume that we have uniformly convergent expansions (24a) or (24b) for the function $f(x)$. We can then determine the coefficients $a_\nu$, $b_\nu$, respectively, $\alpha_\nu$ in these expansions by multiplying by $\cos \mu x$, $\sin \mu x$, respectively by $e^{-i\mu x}$, and integrating from $-\pi$ to $\pi$, using the orthogonality relations (see pp. 274 and 583)

$$\int_{-\pi}^{\pi} \sin \nu x \sin \mu x\,dx = \int_{-\pi}^{\pi} \cos \nu x \cos \mu x\,dx = \begin{cases} 0 & \text{if } \mu \ne \nu, \\ \pi & \text{if } \mu = \nu \ne 0, \end{cases}$$

$$\int_{-\pi}^{\pi} \sin \nu x \cos \mu x\,dx = 0, \tag{25}$$

$$\int_{-\pi}^{\pi} e^{i\nu x} e^{-i\mu x}\,dx = \begin{cases} 0 & \text{if } \mu \ne \nu, \\ 2\pi & \text{if } \mu = \nu. \end{cases}$$

Writing $t$ for the variable of integration, we at once obtain the formulas

$$a_\mu = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t) \cos \mu t\,dt, \qquad b_\mu = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t) \sin \mu t\,dt \tag{26a}$$

for $\mu = 0, 1, 2, \ldots$, and

$$\alpha_\mu = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) e^{-i\mu t}\,dt \tag{26b}$$

for $\mu = 0, \pm 1, \pm 2, \ldots$.

Thus, if $f(x)$ can be expanded at all into a uniformly convergent series (24a) or (24b), then the coefficients can only have the values determined by the formulas (26a) and (26b). But even without a justification for this very tentative procedure, these formulas (26a) or (26b) define sequences of numbers $a_\nu$, $b_\nu$, and $\alpha_\nu$ called the Fourier coefficients for every function $f(x)$ which is continuous or piecewise continuous in the interval $-\pi \le x \le \pi$.

For a given function $f(x)$, we form with the coefficients thus defined by (26a, b) the Fourier partial sums

$$S_n(x) = \tfrac{1}{2}a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

or

$$S_n(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}.$$

Our task is to prove that these Fourier sums actually converge for $n \to \infty$ and that the limit is the function $f(x)$.
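The coefficient formulas (26a) lend themselves to direct numerical evaluation. The following sketch approximates $a_\nu$ and $b_\nu$ by the midpoint rule and assembles the partial sum $S_n(x)$. The function names, the sample count, and the choice of the midpoint rule are our own assumptions, and the test function $f(x) = x$ is used only because its coefficients $b_\nu = 2(-1)^{\nu+1}/\nu$ are known from (23b).

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Approximate a_0..a_n and b_0..b_n from (26a) by the midpoint rule."""
    h = 2.0 * math.pi / samples
    xs = [-math.pi + (k + 0.5) * h for k in range(samples)]
    fs = [f(x) for x in xs]
    a = [sum(fx * math.cos(v * x) for fx, x in zip(fs, xs)) * h / math.pi
         for v in range(n + 1)]
    b = [sum(fx * math.sin(v * x) for fx, x in zip(fs, xs)) * h / math.pi
         for v in range(n + 1)]
    return a, b

def partial_sum(a, b, x):
    """S_n(x) = a_0/2 + sum of (a_v cos vx + b_v sin vx) for v = 1..n."""
    return a[0] / 2 + sum(a[v] * math.cos(v * x) + b[v] * math.sin(v * x)
                          for v in range(1, len(a)))

a, b = fourier_coefficients(lambda x: x, 5)
print([round(c, 6) for c in a])
print([round(c, 6) for c in b])
```

For an odd function such as $f(x) = x$ the computed $a_\nu$ vanish up to rounding, while the $b_\nu$ reproduce the coefficients of the sine series.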
We now state the


MAIN THEOREM. The Fourier series

$$\tfrac{1}{2}a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x) \tag{27a}$$

or

$$\sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x} \tag{27b}$$

formed with the Fourier coefficients (26a) or (26b) converges to the value $f(x)$ for any sectionally continuous function $f(x)$ of period $2\pi$, which has sectionally continuous derivatives of first and second order.¹ Here the value of $f(x)$ at a point of discontinuity must be defined by

$$f(x) = \tfrac{1}{2}[f(x + 0) + f(x - 0)]. \tag{27c}$$

¹ We mention again that this theorem can be proved for much more general classes of functions (see, for example, Section 8.6). The result formulated here, however, amply suffices for most applications.


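PROOF.¹ For the proof we substitute in the $n$th "Fourier polynomial"

$$S_n(x) = \tfrac{1}{2}a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

the integral expressions (26a) for the coefficients and then interchange the order of integration and summation; we obtain

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[\tfrac{1}{2} + \sum_{\nu=1}^{n} (\cos \nu t \cos \nu x + \sin \nu t \sin \nu x)\right] dt,$$

or, using the addition theorem for the cosine,

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[\tfrac{1}{2} + \sum_{\nu=1}^{n} \cos \nu (t - x)\right] dt.$$

By the summation formula (14) of p. 586 therefore,

$$S_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\frac{\sin (n + \frac{1}{2})(t - x)}{\sin \frac{1}{2}(t - x)}\,dt.$$

Finally, setting $\tau = t - x$ and recalling that periodicity allows us to shift the interval of integration by the quantity $x$ (see p. 574), we obtain

$$S_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x + \tau)\,\frac{\sin (n + \frac{1}{2})\tau}{\sin \frac{1}{2}\tau}\,d\tau, \tag{28a}$$

where $x$ is, of course, fixed.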
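The step from the cosine sum to the closed kernel rests on the summation formula cited as (14). Assuming that formula is the familiar identity $\tfrac{1}{2} + \cos\tau + \cdots + \cos n\tau = \sin\left((n + \tfrac{1}{2})\tau\right) / \left(2 \sin \tfrac{1}{2}\tau\right)$, which is what (28a) requires, it can be spot-checked as follows. The names `cosine_sum` and `dirichlet_kernel` are our own labels.

```python
import math

def cosine_sum(tau, n):
    """1/2 + cos(tau) + cos(2 tau) + ... + cos(n tau)."""
    return 0.5 + sum(math.cos(v * tau) for v in range(1, n + 1))

def dirichlet_kernel(tau, n):
    """sin((n + 1/2) tau) / (2 sin(tau / 2)), for tau not a multiple of 2 pi."""
    return math.sin((n + 0.5) * tau) / (2.0 * math.sin(tau / 2.0))

for tau in (0.3, 1.0, 2.5):
    print(tau, cosine_sum(tau, 12), dirichlet_kernel(tau, 12))
```

The closed form must not be evaluated at $\tau = 0$, where the quotient is defined only by its limit $n + \tfrac{1}{2}$.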
We now prove that $S_n(x)$ tends for $n \to \infty$ to $f(x)$; or

$$\lim_{n \to \infty} S_n(x) = \lim_{n \to \infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x + t)\,\frac{\sin (n + \frac{1}{2})t}{\sin \frac{1}{2}t}\,dt = f(x). \tag{29}$$

Because $f(x) = \tfrac{1}{2}[f(x + 0) + f(x - 0)]$ for all $x$, we have [see formula (15), p. 586]

$$S_n(x) - f(x) = \frac{1}{\pi}\int_0^{\pi} \frac{f(x + t) - f(x + 0)}{2\sin \frac{1}{2}t}\,\sin (n + \tfrac{1}{2})t\,dt + \frac{1}{\pi}\int_{-\pi}^{0} \frac{f(x + t) - f(x - 0)}{2\sin \frac{1}{2}t}\,\sin (n + \tfrac{1}{2})t\,dt.$$

¹ We give here only the proof for expansion of $f$ in a series (27a). Series (27b) follows then by the substitutions given in Eq. (13b), p. 585.

If we can show now that the functions $[f(x + t) - f(x + 0)]/(2\sin \tfrac{1}{2}t)$ and $[f(x + t) - f(x - 0)]/(2\sin \tfrac{1}{2}t)$ of the variable $t$ are sectionally continuous together with their first derivatives in the intervals $0 \le t \le \pi$ and $-\pi \le t \le 0$ respectively, then by our basic lemma (p. 588) both integrals on the right-hand side tend to zero for $n \to \infty$ and formula (29) follows.

Thus the main theorem is proved if we can show that for a fixed $x$ the function of $t$ defined by

$$\phi(t) = \begin{cases} \dfrac{f(x + t) - f(x + 0)}{2\sin \frac{1}{2}t} & \text{for } 0 < t \le \pi, \\[2ex] \dfrac{f(x + t) - f(x - 0)}{2\sin \frac{1}{2}t} & \text{for } -\pi \le t < 0 \end{cases}$$

is sectionally continuous and has a sectionally continuous first derivative, provided $f$, $f'$, $f''$ are sectionally continuous.

To ascertain that these conditions are satisfied for the quotient $\phi(t)$ we first observe that the denominator vanishes only for $t = 0$, and that therefore $\phi$ and its first derivative are sectionally continuous except possibly near $t = 0$. Only at the singular point $t = 0$ could a loss of differentiability occur. All we have to do, therefore, is to show that $\phi(t)$ and its derivative $\phi'(t)$ approach limits if $t$ tends to zero from positive or negative values respectively. We shall indeed show that these limits exist, and that they have the values

$$\phi(+0) = f'(x + 0), \qquad \phi(-0) = f'(x - 0),$$

respectively

$$\phi'(+0) = \tfrac{1}{2}f''(x + 0), \qquad \phi'(-0) = \tfrac{1}{2}f''(x - 0).$$


For the proof we introduce the function $g(t)$ by $\phi(t) = g(t)h(t)$, where the factor $h(t)$ is defined by

$$h(t) = \frac{t}{2\sin (t/2)} \quad \text{for } t \ne 0, \qquad h(0) = 1.$$

We have (see Chapter 5, p. 465) in $h(t)$ a continuous function with a continuous derivative in the whole interval $-\pi \le t \le \pi$ with $h(0) = 1$, $h'(0) = 0$; therefore in the limit for $t \to 0$ the values of $g(t)$ and $\phi(t)$ as well as those of $g'(t)$ and $\phi'(t) = gh' + g'h$ coincide.

Now in the interval $0 < t \le \pi$ (see Chapter 5, p. 464 for general remarks about indeterminate expressions) by the mean value theorem of calculus

$$g(t) = \frac{f(x + t) - f(x + 0)}{t} = f'(x + \xi)$$

with an intermediate value $\xi$ between $0$ and $t$; hence for $t$, and thus also $\xi$, tending to zero

$$g(+0) = f'(x + 0).$$

For the derivative we obtain

$$g'(t) = \frac{t f'(x + t) - f(x + t) + f(x + 0)}{t^2},$$

again an expression where numerator and denominator both tend to zero for $t \to 0$ and have the derivatives $t f''(x + t)$ and $2t$, respectively. To determine the limit for $t \to 0$ we make use of the generalized mean value theorem (cf. p. 222) and find

$$g'(t) = \frac{\eta f''(x + \eta)}{2\eta} = \tfrac{1}{2}f''(x + \eta)$$

with $\eta$ intermediate between $0$ and $t$. For $t \to 0$ we have $\eta \to 0$, and hence, as said before, $g'(+0) = \phi'(+0) = \tfrac{1}{2}f''(x + 0)$.

The same reasoning applies to negative values of $t$. Consequently, our application of the lemma is justified and the main theorem established.

Again it may be stated that the result obtained is amply sufficient for 
all needs arising in calculus and its applications. Yet the theoretical 
interests of mathematicians, starting with the original work of 
Dirichlet, were frequently aimed at greater generality, that is, at trying 
to expand functions of a wider class.¹ These efforts have stimulated a
more refined analysis of the concepts of function and integral and have 
led to the development of advanced Fourier analysis as an attractive 
specialized field, which, however, must remain outside the scope of 
this book. 


¹ It might be noted that there are examples of continuous functions which are not expandable in a Fourier series. In addition, there exist examples of functions $f(x)$ represented by a convergent trigonometrical series which, however, are not Fourier series having the expressions (26) as coefficients. Such examples show that for refined investigations a distinction between trigonometric series in general and Fourier series in particular is in order. For us moreover, they illustrate the fact that more restrictive conditions than that of continuity are indeed appropriate, even though the restrictions assumed in our main theorem and in an extension given in Section 8.6 are much more severe than really needed. (See for the general theory, Trigonometrical Series, by A. Zygmund, Chelsea Publishing Co., 1952.)


8.5 Examples of Fourier Series

a. Preliminary Remarks


We assume throughout that the period of our functions $f(x)$ is $2\pi$. If $f(x)$ is an even function (cf. p. 29), then clearly $f(x) \sin \nu x$ is odd and $f(x) \cos \nu x$ is even, so that

$$b_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x) \sin \nu x\,dx = 0, \qquad a_\nu = \frac{2}{\pi}\int_0^{\pi} f(x) \cos \nu x\,dx,$$

and we obtain a "cosine series." If, on the other hand, the function $f(x)$ is odd, then

$$a_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x) \cos \nu x\,dx = 0, \qquad b_\nu = \frac{2}{\pi}\int_0^{\pi} f(x) \sin \nu x\,dx,$$

and we obtain a "sine series."¹


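b. Expansion of the Function $\phi(x) = x^2$


For the even function $x^2$, we have upon integrating twice by parts

$$a_\nu = \frac{2}{\pi}\int_0^{\pi} x^2 \cos \nu x\,dx = (-1)^\nu \frac{4}{\nu^2} \quad (\nu > 0), \qquad a_0 = \frac{2\pi^2}{3},$$

so that we obtain the expansion

$$x^2 = \frac{\pi^2}{3} - 4\left(\frac{\cos x}{1^2} - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - + \cdots\right). \tag{30}$$

Differentiating this series term by term and dividing by 2, we formally recover the series (23b), p. 592, obtained previously for $\phi(x) = x$.
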
c. Expansion of $x \cos x$


(See Fig. 8.9.) For this odd function we have

$$a_\nu = 0, \qquad b_\nu = \frac{2}{\pi}\int_0^{\pi} x \cos x \sin \nu x\,dx.$$

¹ Consequently, if the function $f(x)$ is initially given only in the interval $0 < x < \pi$, then we can extend it in the interval $-\pi < x < 0$ either as an odd function or as an even function, and thus for the smaller interval $0 < x < \pi$ either a sine series or a cosine series is obtainable.

Figure 8.9

Using the formula

$$\int_0^{\pi} x \sin \mu x\,dx = (-1)^{\mu+1}\frac{\pi}{\mu} \qquad (\mu = 1, 2, \ldots),$$

we find

$$b_\nu = \frac{2}{\pi}\int_0^{\pi} x \cos x \sin \nu x\,dx = \frac{1}{\pi}\int_0^{\pi} x[\sin (\nu + 1)x + \sin (\nu - 1)x]\,dx = (-1)^\nu \frac{2\nu}{\nu^2 - 1} \qquad (\nu = 2, 3, \ldots),$$

$$b_1 = -\tfrac{1}{2}.$$
We therefore obtain the series

$$x \cos x = -\tfrac{1}{2}\sin x + 2\sum_{\nu=2}^{\infty} \frac{(-1)^\nu \nu}{\nu^2 - 1}\,\sin \nu x. \tag{31}$$

Adding the series (23b), p. 592, found for $\phi(x) = x$ yields

$$x(1 + \cos x) = \tfrac{3}{2}\sin x + 2\left(\frac{\sin 2x}{1 \cdot 2 \cdot 3} - \frac{\sin 3x}{2 \cdot 3 \cdot 4} + \frac{\sin 4x}{3 \cdot 4 \cdot 5} - + \cdots\right). \tag{31a}$$

When the function which is equal to $x \cos x$ in the interval $-\pi < x < \pi$ is extended periodically beyond this interval, the same discontinuities (cf. Fig. 8.7) occur as exhibited by the function $\phi(x)$ considered earlier in Section 8.4d. On the other hand, the function $x(1 + \cos x)$, periodically extended, remains continuous at the end points of the intervals, and in fact its derivative also remains continuous, since the discontinuities are eliminated by the factor $1 + \cos x$, which together with its derivative vanishes at the end points. This accounts for the fact that the series (31a) converges uniformly for all $x$, as is evident by comparison with the series with constant terms

$$\frac{1}{1^3} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots.$$
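The rapid, uniform convergence of the series for $x(1 + \cos x)$ can be checked directly. In this sketch the coefficients are taken as $\tfrac{3}{2}$ for $\sin x$ and $2(-1)^\nu / ((\nu - 1)\nu(\nu + 1))$ for $\sin \nu x$ with $\nu \ge 2$, which is what results from adding the sine series for $x$ to the series for $x \cos x$. The function names and the grid size are our own choices.

```python
import math

def series_31a(x, n):
    """Partial sum of the series for x (1 + cos x) with terms up to sin(n x), n >= 1."""
    total = 1.5 * math.sin(x)
    for v in range(2, n + 1):
        total += 2.0 * (-1) ** v * math.sin(v * x) / ((v - 1) * v * (v + 1))
    return total

def max_error(n, points=400):
    """Largest deviation from x (1 + cos x) on a grid covering [-pi, pi]."""
    xs = [-math.pi + 2.0 * math.pi * k / points for k in range(points + 1)]
    return max(abs(x * (1.0 + math.cos(x)) - series_31a(x, n)) for x in xs)

for n in (5, 20, 80):
    print(n, max_error(n))
```

Since the omitted terms are bounded by $2/((\nu - 1)\nu(\nu + 1))$, their sum telescopes to $1/(n(n + 1))$, so the printed maximum errors should lie below that value on the whole interval, end points included.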
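d. The Function $f(x) = |x|$


For this even function $b_\nu = 0$, and $a_\nu = \frac{2}{\pi}\int_0^{\pi} x \cos \nu x\,dx$; by integrating by parts we readily obtain

$$\int_0^{\pi} x \cos \nu x\,dx = \frac{1}{\nu}\,x \sin \nu x\,\Big|_0^{\pi} - \frac{1}{\nu}\int_0^{\pi} \sin \nu x\,dx = \begin{cases} 0 & \text{if } \nu \text{ is even and } \ne 0, \\ -\dfrac{2}{\nu^2} & \text{if } \nu \text{ is odd.} \end{cases}$$

Consequently,

$$|x| = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots\right). \tag{32}$$

Putting $x = 0$, we obtain the remarkable formula

$$\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots. \tag{32a}$$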
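Both the cosine series for $|x|$ and the resulting sum of reciprocal odd squares are easy to test numerically. The sketch below assumes the series in the form $|x| = \pi/2 - (4/\pi)(\cos x + \cos 3x / 3^2 + \cos 5x / 5^2 + \cdots)$; the function names and the numbers of terms are our own.

```python
import math

def abs_series(x, terms):
    """Partial sum of the cosine series for |x| using the first `terms` odd harmonics."""
    s = sum(math.cos((2 * k + 1) * x) / (2 * k + 1) ** 2 for k in range(terms))
    return math.pi / 2.0 - 4.0 / math.pi * s

def odd_reciprocal_squares(terms):
    """1 + 1/3^2 + 1/5^2 + ... with `terms` summands."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(terms))

print(abs_series(1.0, 2000), abs(1.0))
print(odd_reciprocal_squares(100000), math.pi ** 2 / 8.0)
```

Because the coefficients decay like $1/\nu^2$, a few thousand terms already give several correct digits at every point of the interval.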
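e. A Piecewise Constant Function

The function defined by the equations

$$f(x) = \operatorname{sgn} x = \begin{cases} -1 & \text{for } -\pi < x < 0, \\ 0 & \text{for } x = 0, \\ +1 & \text{for } 0 < x < \pi, \end{cases}$$

as indicated in Fig. 1.22, p. 32, is odd. Hence $a_\nu = 0$ and

$$b_\nu = \frac{2}{\pi}\int_0^{\pi} \sin \nu x\,dx = \begin{cases} 0 & \text{if } \nu \text{ is even,} \\ \dfrac{4}{\pi\nu} & \text{if } \nu \text{ is odd,} \end{cases}$$

so that the Fourier series for this function is

$$\operatorname{sgn} x = \frac{4}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right). \tag{33}$$

For $x = \tfrac{1}{2}\pi$, in particular, this again yields Leibnitz's series.

Figure 8.10

The series (33) can be formally derived from that for $|x|$ given in (32), using term-by-term differentiation.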


f. The Function $|\sin x|$

The even function $f(x) = |\sin x|$ can be expanded in a cosine series, with the coefficients $a_\nu$ given by the following calculations:

$$\tfrac{1}{2}\pi a_\nu = \int_0^{\pi} \sin x \cos \nu x\,dx = \tfrac{1}{2}\int_0^{\pi} [\sin (\nu + 1)x - \sin (\nu - 1)x]\,dx = \begin{cases} 0 & \text{if } \nu \text{ is odd,} \\ -\dfrac{2}{\nu^2 - 1} & \text{if } \nu \text{ is even.} \end{cases}$$

We thus obtain, writing $2\nu$ instead of $\nu$,

$$|\sin x| = \frac{2}{\pi} - \frac{4}{\pi}\sum_{\nu=1}^{\infty} \frac{\cos 2\nu x}{4\nu^2 - 1}. \tag{34}$$


g. Expansion of $\cos \mu x$. Resolution of the Cotangent into Partial Fractions. The Infinite Product for the Sine


The function $f(x) = \cos \mu x$ for $-\pi < x < \pi$, where $\mu$ is not an integer, is even; hence $b_\nu = 0$, whereas

$$\tfrac{1}{2}\pi a_\nu = \int_0^{\pi} \cos \mu x \cos \nu x\,dx = \tfrac{1}{2}\int_0^{\pi} [\cos (\mu + \nu)x + \cos (\mu - \nu)x]\,dx$$

$$= \tfrac{1}{2}\left[\frac{\sin (\mu + \nu)\pi}{\mu + \nu} + \frac{\sin (\mu - \nu)\pi}{\mu - \nu}\right] = \frac{\mu (-1)^\nu}{\mu^2 - \nu^2}\,\sin \mu\pi.$$

We thus have

$$\cos \mu x = \frac{2\mu \sin \mu\pi}{\pi}\left(\frac{1}{2\mu^2} - \frac{\cos x}{\mu^2 - 1^2} + \frac{\cos 2x}{\mu^2 - 2^2} - + \cdots\right). \tag{35}$$

This function extended periodically with period $2\pi$ from the interval $-\pi < x < \pi$ remains continuous at the points $x = \pm\pi$. Putting $x = \pi$, dividing both sides of the equation by $\sin \mu\pi$, and writing $x$ instead of $\mu$, we obtain the equation

$$\cot \pi x = \frac{2x}{\pi}\left(\frac{1}{2x^2} + \frac{1}{x^2 - 1^2} + \frac{1}{x^2 - 2^2} + \cdots\right). \tag{36}$$

This is the resolution of the cotangent into partial fractions (in analogy to the finite partial fraction resolutions of rational functions discussed in Chapter 3, p. 286), a very important formula of analysis.
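The partial fraction resolution of the cotangent can be tested for any non-integer argument. The sketch below assumes the formula in the form $\cot \pi x = (2x/\pi)\left(1/(2x^2) + \sum_{\nu \ge 1} 1/(x^2 - \nu^2)\right)$; the name `cot_series` and the truncation length are our own choices.

```python
import math

def cot_series(x, terms):
    """(2x/pi) * (1/(2x^2) + sum of 1/(x^2 - v^2), v = 1..terms)."""
    total = 1.0 / (2.0 * x * x)
    for v in range(1, terms + 1):
        total += 1.0 / (x * x - v * v)
    return 2.0 * x / math.pi * total

for x in (0.25, 0.3, 1.7):
    exact = math.cos(math.pi * x) / math.sin(math.pi * x)
    print(x, exact, cot_series(x, 100000))
```

The terms decay only like $1/\nu^2$, so the truncation error after $N$ terms is of order $x/N$; this is why a large number of terms is used here.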
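We write this series in the form

$$\cot \pi x - \frac{1}{\pi x} = -\frac{2x}{\pi}\left(\frac{1}{1^2 - x^2} + \frac{1}{2^2 - x^2} + \cdots\right).$$

If $x$ lies in an interval $0 \le x \le q < 1$, the $n$th term on the right is less in absolute value than $2/[\pi(n^2 - q^2)]$. Hence the series converges uniformly in this interval and can be integrated term by term. Multiplying both sides by $\pi$ and integrating, we obtain

$$\pi\int_0^x \left(\cot \pi t - \frac{1}{\pi t}\right) dt = \log \frac{\sin \pi x}{\pi x} - \lim_{a \to 0} \log \frac{\sin \pi a}{\pi a} = \log \frac{\sin \pi x}{\pi x}$$

on the left and

$$\log \left(1 - \frac{x^2}{1^2}\right) + \log \left(1 - \frac{x^2}{2^2}\right) + \cdots = \lim_{n \to \infty} \sum_{\nu=1}^{n} \log \left(1 - \frac{x^2}{\nu^2}\right)$$

on the right. Thus

$$\log \frac{\sin \pi x}{\pi x} = \lim_{n \to \infty} \sum_{\nu=1}^{n} \log \left(1 - \frac{x^2}{\nu^2}\right) = \lim_{n \to \infty} \log \prod_{\nu=1}^{n} \left(1 - \frac{x^2}{\nu^2}\right) = \log \lim_{n \to \infty} \prod_{\nu=1}^{n} \left(1 - \frac{x^2}{\nu^2}\right).$$

If we pass from the logarithm to the exponential function we have

$$\sin \pi x = \pi x \left(1 - \frac{x^2}{1^2}\right)\left(1 - \frac{x^2}{2^2}\right)\left(1 - \frac{x^2}{3^2}\right) \cdots. \tag{36a}$$

We have thus obtained the famous expression for the sine as an infinite product.¹

From this result, by putting $x = \tfrac{1}{2}$, we obtain Wallis's product

$$\frac{\pi}{2} = \prod_{\nu=1}^{\infty} \frac{2\nu}{2\nu - 1} \cdot \frac{2\nu}{2\nu + 1} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots$$

as derived before on p. 281.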
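The infinite product for the sine and Wallis's product converge slowly but visibly. The following sketch evaluates partial products; it assumes the product in the form $\sin \pi x = \pi x \prod_{\nu \ge 1} (1 - x^2/\nu^2)$, and the function names and factor counts are our own.

```python
import math

def sine_product(x, factors):
    """Partial product pi*x*(1 - x^2/1^2)...(1 - x^2/factors^2)."""
    value = math.pi * x
    for v in range(1, factors + 1):
        value *= 1.0 - (x * x) / (v * v)
    return value

def wallis_product(factors):
    """Partial Wallis product: product of (2v)^2 / ((2v - 1)(2v + 1))."""
    value = 1.0
    for v in range(1, factors + 1):
        value *= (2.0 * v) ** 2 / ((2.0 * v - 1.0) * (2.0 * v + 1.0))
    return value

print(sine_product(0.3, 100000), math.sin(0.3 * math.pi))
print(wallis_product(100000), math.pi / 2.0)
```

Every factor of the Wallis product exceeds 1, so its partial products approach $\pi/2$ from below; the relative error after $N$ factors is roughly $1/(4N)$.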


h. Further Examples 


By brief calculations similar to the preceding, we obtain further examples of expansions.

The function $f(x)$ defined by the equation $f(x) = \sin \mu x$ for $-\pi < x < \pi$ can be expanded in the series

$$\sin \mu x = -\frac{2\sin \mu\pi}{\pi}\left(\frac{\sin x}{\mu^2 - 1^2} - \frac{2\sin 2x}{\mu^2 - 2^2} + \frac{3\sin 3x}{\mu^2 - 3^2} - + \cdots\right). \tag{37}$$

Putting $x = \tfrac{1}{2}\pi$ and using the relation $\sin \mu\pi = 2\sin \tfrac{1}{2}\mu\pi \cos \tfrac{1}{2}\mu\pi$ yields the resolution of the secant, that is, of the function $1/\cos \tfrac{1}{2}\mu\pi$ into partial fractions; this expansion is

$$\pi \sec \pi x = \frac{\pi}{\cos \pi x} = 4\sum_{\nu=1}^{\infty} \frac{(-1)^\nu (2\nu - 1)}{4x^2 - (2\nu - 1)^2},$$

where we have written $x$ in place of $\tfrac{1}{2}\mu$.


¹ This formula is particularly interesting because it exhibits directly that the function $\sin \pi x$ vanishes at the points $x = 0, \pm 1, \pm 2, \ldots$. In this respect it corresponds to the factorization of a polynomial when its zeros are known.

Series analogous to (35) and (37) for the hyperbolic functions $\cosh \mu x$ and $\sinh \mu x$ $(-\pi < x < \pi)$ are

$$\cosh \mu x = \frac{2\mu}{\pi}\sinh \mu\pi \left(\frac{1}{2\mu^2} - \frac{\cos x}{\mu^2 + 1^2} + \frac{\cos 2x}{\mu^2 + 2^2} - \frac{\cos 3x}{\mu^2 + 3^2} + - \cdots\right),$$

$$\sinh \mu x = \frac{2}{\pi}\sinh \mu\pi \left(\frac{\sin x}{\mu^2 + 1^2} - \frac{2\sin 2x}{\mu^2 + 2^2} + \frac{3\sin 3x}{\mu^2 + 3^2} - + \cdots\right).$$


8.6 Further Discussion of Convergence 
a. Results 


A closer examination of the Fourier coefficients $a_\nu$, $b_\nu$ leads easily to the following corollaries to the main theorem of Section 8.4e, p. 593.


(a) The Fourier series (27), p. 594, converge to f(x) for all periodic 
functions under the relaxed condition that f(x) and merely its first 
derivative f'(x) are sectionally continuous or, as we say, that the 
function is sectionally smooth. 

(b) If the periodic sectionally smooth function f(x) is continuous, the 
convergence is absolute and uniform. 

(c) If the sectionally smooth function f(x) suffers jump discon- 
tinuities, the convergence is uniform in each closed interval which does 
not contain a point of discontinuity. 


The proof of (b) depends on a simple inequality of Bessel, whereas for the proof of (a) and (c) the results of Section 8.4d, p. 591, will be used.


b. Bessel’s Inequality 


This inequality yields bounds for the Fourier coefficients of any piecewise continuous, not necessarily differentiable, function. It states that

$$\tfrac{1}{2}a_0^2 + \sum_{\nu=1}^{n} (a_\nu^2 + b_\nu^2) \le M^2, \tag{38}$$

where the bound $M^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx$ is a number fixed by the function $f(x)$ and depends neither on the individual Fourier coefficients $a_\nu$, $b_\nu$ nor the number $n$. With the complex Fourier coefficients $\alpha_\nu$ [see (13a), p. 585], Bessel's inequality can be immediately written in the form

$$2\sum_{\nu=-n}^{n} |\alpha_\nu|^2 \le \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx = M^2. \tag{38a}$$

The inequality is a direct consequence of the obvious fact that

$$\frac{1}{\pi}\int_{-\pi}^{\pi} \left[f(x) - \tfrac{1}{2}a_0 - \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)\right]^2 dx \ge 0.$$

We evaluate the integral by expanding the square under the integral sign and observing the orthogonality relations (25), p. 593, as well as the definitions (17), p. 587, of the Fourier coefficients: by integrating the individual terms we immediately obtain Bessel's inequality in the form (38) stated above.

Since the left-hand side of Bessel's inequality increases monotonically with $n$ and the upper bound $M^2$ is fixed, we can pass to the limit $n \to \infty$ and infer that the inequality

$$2\sum_{\nu=-\infty}^{\infty} |\alpha_\nu|^2 = \tfrac{1}{2}a_0^2 + \sum_{\nu=1}^{\infty} (a_\nu^2 + b_\nu^2) \le M^2 \tag{39}$$

is valid. The inequality (39) holds for the Fourier coefficients of a piecewise continuous function $f(x)$ even if $f$ should not be represented by the series (27a) or (27b).

Incidentally, we shall show in Section 8.7d that Bessel's inequality (39) remains valid if we replace the inequality sign by that of equality.
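Bessel's inequality can be watched at work on the function $f(x) = x$ of Section 8.4d, whose coefficients are $a_\nu = 0$, $b_\nu = 2(-1)^{\nu+1}/\nu$ and for which $M^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = 2\pi^2/3$. The sketch below is only an illustration; the names `bessel_left_side` and `M_SQUARED` are our own.

```python
import math

def bessel_left_side(n):
    """(1/2) a_0^2 + sum of (a_v^2 + b_v^2), v = 1..n, for f(x) = x on (-pi, pi)."""
    # For f(x) = x: a_v = 0 and b_v = 2 (-1)^(v+1) / v.
    return sum((2.0 / v) ** 2 for v in range(1, n + 1))

# M^2 = (1/pi) * integral of x^2 over [-pi, pi] = 2 pi^2 / 3.
M_SQUARED = 2.0 * math.pi ** 2 / 3.0

for n in (1, 10, 100, 1000):
    print(n, bessel_left_side(n), M_SQUARED)
```

The left-hand side increases with $n$ and stays below $M^2$, while the gap shrinks roughly like $4/n$, consistent with the remark that equality holds in the limit.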


*c. Proof of Corollaries (a), (b), and (c)


Assuming $f(x)$ itself to be continuous we apply Bessel's inequality to its piecewise continuous derivative $g(x) = f'(x)$ which has the Fourier coefficients $c_\nu = +\nu b_\nu$, $d_\nu = -\nu a_\nu$, as we find immediately using integration by parts (since the integrated terms cancel):

$$c_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} f'(x) \cos \nu x\,dx = +\frac{1}{\pi}\int_{-\pi}^{\pi} \nu f(x) \sin \nu x\,dx = +\nu b_\nu,$$

and similarly for $d_\nu$. [Here we have made use of the continuity and periodicity of $f(x)$.] We have therefore

$$\sum_{\nu=1}^{n} \nu^2 (a_\nu^2 + b_\nu^2) = \sum_{\nu=1}^{n} (c_\nu^2 + d_\nu^2) \le \frac{1}{\pi}\int_{-\pi}^{\pi} g(x)^2\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} f'(x)^2\,dx = M^2.$$
This result allows us to construct for the Fourier series of $f(x)$ a majorant with constant positive terms, which according to p. 535 assures absolute and uniform convergence as stated in (b). Indeed, we have first for the $\nu$th harmonic oscillation by the Cauchy-Schwarz inequality (cf. p. 15)

$$|a_\nu \cos \nu x + b_\nu \sin \nu x|^2 \le (a_\nu^2 + b_\nu^2)(\cos^2 \nu x + \sin^2 \nu x) = a_\nu^2 + b_\nu^2;$$

then by using the inequality

$$pq \le \tfrac{1}{2}(p^2 + q^2)$$

for $p = 1/\nu$, $q = \nu\sqrt{a_\nu^2 + b_\nu^2}$, we have for all $\nu$,

$$|a_\nu \cos \nu x + b_\nu \sin \nu x| \le \frac{1}{\nu} \cdot \nu\sqrt{a_\nu^2 + b_\nu^2} \le \frac{1}{2}\left[\frac{1}{\nu^2} + \nu^2 (a_\nu^2 + b_\nu^2)\right].$$

Since the sum over $\nu$ of the last expression is convergent, we have constructed a majorant. Therefore the Fourier series

$$\tfrac{1}{2}a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

converges uniformly. It then has a sum $s(x)$ which is a continuous function of $x$. To show that actually $s(x) = f(x)$ we use an artifice by considering the integrated function

$$F(x) = \int_{-\pi}^{x} [f(t) - \tfrac{1}{2}a_0]\,dt.$$

Clearly, $F(x)$ is continuous for $-\pi \le x \le \pi$; moreover, $F$ has the same value at $x = -\pi$ and $x = \pi$, since

$$F(\pi) = \int_{-\pi}^{\pi} f(t)\,dt - \pi a_0 = 0 = F(-\pi).$$

Hence the periodic extension of $F$ is continuous. Since also the first and second derivatives of $F$ are sectionally continuous, the function $F$ is represented by its Fourier series. By the same argument based on integration by parts as before the Fourier coefficients of $F$ are $-(1/\nu)b_\nu$ and $(1/\nu)a_\nu$ for $\nu \ne 0$, so that

$$F(x) = \tfrac{1}{2}A_0 + \sum_{\nu=1}^{\infty} \left(-\frac{b_\nu}{\nu}\cos \nu x + \frac{a_\nu}{\nu}\sin \nu x\right)$$

with some constant coefficient $A_0$. Now the series obtained by formal term-by-term differentiation is already known to converge uniformly. Consequently, formal term-by-term differentiation is legitimate (see p. 539), and we obtain the desired relation

$$F'(x) = f(x) - \tfrac{1}{2}a_0 = \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x).$$


To prove the remaining statements for $f$ sectionally continuous and periodic with a sectionally continuous derivative $f'$ we recall that by our previous result they are true for the periodic function $\psi(x)$ of Section 8.4d and hence for the function $\psi(x - \xi)$ which suffers the jump $2\pi$ at the point $\xi$. If now the function $f(x)$ suffers the jumps $\beta_1, \beta_2, \ldots, \beta_m$ at the points $\xi_1, \xi_2, \ldots, \xi_m$, then

$$f^*(x) = f(x) - \frac{1}{2\pi}\sum_{i=1}^{m} \beta_i\,\psi(x - \xi_i)$$

satisfies the conditions of (b) and hence possesses a uniformly convergent Fourier series, thus proving statements (a) and (c) for $f(x)$.


d. Order of Magnitude of the Fourier Coefficients. 
Differentiation of Fourier Series 


The preceding discussions of convergence illustrate a general fact: The Fourier coefficients $a_\nu$, $b_\nu$ converge more rapidly to zero as $\nu \to \infty$, when $f(x)$ is smoother, that is, when more derivatives of the periodic function $f(x)$ are continuous. Correspondingly, the Fourier series converges better as the functions are smoother. We state precisely: If the periodic function $f(x)$ has continuous derivatives up to order $k$ and a piecewise continuous derivative of order $k + 1$, there exists a bound $B$, depending only on $f(x)$ and $k$, such that

$$|a_\nu|,\ |b_\nu| \le \frac{B}{\nu^{k+1}}. \tag{40}$$

The proof is again (see above) almost immediate if we use integration by parts. For brevity we write in complex notation

$$a_\nu - i b_\nu = 2\alpha_\nu$$

and integrate successively by parts until in the integrand the factor $f^{(k+1)}(x)$ appears. Because of the periodicity and continuity of $f(x)$, $f'(x)$, etc., the boundary terms cancel each other and

$$2\pi\alpha_\nu = \int_{-\pi}^{\pi} f(x) e^{-i\nu x}\,dx = -\frac{i}{\nu}\int_{-\pi}^{\pi} f'(x) e^{-i\nu x}\,dx = \cdots = \left(-\frac{i}{\nu}\right)^{k+1}\int_{-\pi}^{\pi} f^{(k+1)}(x) e^{-i\nu x}\,dx.$$

Hence if $\tfrac{1}{2}B$ is an upper bound for $|f^{(k+1)}(x)|$, then $|\alpha_\nu| \le \tfrac{1}{2}B/\nu^{k+1}$, which implies the inequalities (40).

A further remarkable result is that for $k > 2$ the Fourier series can be differentiated term by term $k - 1$ times and then yields the Fourier series for the differentiated function. For the proof we observe that all these differentiated series have the convergent series

$$B\sum_{\nu=1}^{\infty} \frac{1}{\nu^2}$$

as majorant, hence converge absolutely and uniformly themselves (cf. the criteria of Chapter 7, p. 541).


*8.7 Approximation by Trigonometric and Rational Polynomials 
a. General Remark on Representations of Functions 


In what manner the concept of function should be restricted by 
demanding the possibility of “explicit expressions” has been a challeng- 
ing question since the early times of calculus. Functions often are not 
given analytically, but rather by geometrical or mechanical con- 
structions or by the geometric description of their graphs, which could 
be of a different nature in different intervals. 

The discovery of Fourier series in the early nineteenth century was 
a most illuminating step towards answering the old question; it 
revealed that indeed “arbitrary” functions, certainly much less 
restricted than ‘‘analytic’” ones, can be expressed by convergent 
Fourier series. Yet even the Fourier series do not cover all continuous 
functions: as we mentioned without proof, one can define continuous 
functions for which the Fourier series, formed with the Fourier co- 
efficients, does not converge. 

It is all the more remarkable that by giving up the principle of infinite series in which the approximation is achieved by addition of higher order terms only, we can for any continuous function $f(x)$ construct approximating trigonometric or rational polynomials $P_n(x)$ of order $n$ which converge for $n \to \infty$ in a closed interval uniformly to the given function $f(x)$.


b. Weierstrass Approximation Theorem 


We prove the following closely related theorems. 


(a) If $f(x)$ is a continuous function in a closed interval $I$, which is contained in the larger interval $-\pi < x < \pi$, then $f$ can in $I$ be uniformly approximated by a trigonometric polynomial of period $2\pi$ of sufficiently high order $n$.

(b) Any function $f(x)$ which is continuous in a closed interval $I$ can be uniformly approximated in $I$ by a polynomial $P(x)$ in $x$. This statement, due to Weierstrass, can be supplemented (see p. 539) by the corollary:

*(c) If $f(x)$ possesses a continuous derivative in $I$ then the approximating polynomials can be so chosen that the derivative polynomials $P_n'(x)$ approximate the derivative $f'(x)$ uniformly.


The proof of (a) is quite direct. We first approximate $f(x)$ by a piecewise linear function whose graph is a polygon $L_n(x)$ inscribed in the graph of $f(x)$ (see Fig. 8.11). Obviously, $L_n(x)$ differs from $f(x)$ absolutely by less than an arbitrarily small chosen margin $\epsilon/2$, if the vertices of the polygon are at equally spaced points $x_1, x_2, \ldots, x_n$ and the constant $h = x_2 - x_1$ is chosen sufficiently small, due to the uniform continuity in $I$ of the continuous function $f$ (cf. p. 100).

Figure 8.11 Uniform approximation of continuous function by a polygon.

The next step is to join, as indicated in the figure, the end points $-\pi$ and $\pi$ of the larger interval by straight lines, and thus extend $L_n(x)$ into a piecewise linear function, again called $L_n(x)$, within the closed interval $-\pi \le x \le \pi$: this function, being zero at both end points, can now be extended periodically and, according to Section 8.6a, can be expanded in a uniformly convergent Fourier series whose polynomial section $S_m(x)$ differs from $L_n(x)$ absolutely by less than $\epsilon/2$ if $m$ is sufficiently large. Now $|S_m - f| \le |S_m - L_n| + |L_n - f| < \epsilon$, and (a) is proved.¹

To prove (b) we replace in each term of the finite sum $S_m(x)$ according to Section 5.5b, p. 454, the trigonometric functions $\cos \nu x$ and $\sin \nu x$ by Taylor polynomials with a uniformly small remainder; hence, combining these last approximations, we construct a polynomial $P_N(x)$ for which $|P_N(x) - S_m(x)| < \epsilon/2$, where we must choose $N$ large enough to attain the accuracy $\epsilon/2$. Combining, we have certainly in the smaller interval $|P_N(x) - f(x)| < \epsilon$ if $m$ is chosen such that $|S_m(x) - f(x)| < \epsilon/2$.


¹ The same result holds when $I$ is the whole interval $-\pi \le x \le +\pi$ if we assume that $f(\pi) = f(-\pi)$. Here we choose an approximating polygon $L_n(x)$ as before, only choosing $L_n(-\pi) = L_n(\pi) = f(-\pi) = f(\pi)$.


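*c. Fejér's Trigonometric Approximation of Fourier Polynomials by Arithmetical Means


The theorem (a) of Section 8.7b can be proved very simply by a direct and rather explicit construction of the approximating polynomial, which is provided by the following remarkable theorem of L. Fejér.


THEOREM. If $S_n(x)$ is the $n$th Fourier polynomial of a periodic continuous function $f(x)$, then the arithmetical mean

$$F_n(x) = \frac{S_0(x) + \cdots + S_n(x)}{n + 1}$$

converges uniformly to $f(x)$ for $n \to \infty$.


The theorem guarantees convergence by averaging out whatever disturbing oscillations might occur in the ordinary Fourier approximation.


PROOF. The proof is similar to that of the main theorem of Fourier expansion, but it is simpler because the oscillating kernel

$$\frac{\sin (n + \frac{1}{2})x}{2\sin \frac{1}{2}x}$$

occurring there is replaced here by the positive "Fejér kernel"

$$s_n(t) = \frac{1}{2(n + 1)}\left(\frac{\sin \frac{1}{2}(n + 1)t}{\sin \frac{1}{2}t}\right)^2.$$

We first note that the function $\sigma_n(\alpha) = \tfrac{1}{2} + \cos \alpha + \cdots + \cos n\alpha$ of p. 586 can be written in the form

$$\sigma_n(\alpha) = \frac{\sin (n + \frac{1}{2})\alpha}{2\sin \frac{1}{2}\alpha} = \frac{\sin \frac{1}{2}\alpha\,\sin (n + \frac{1}{2})\alpha}{2\sin^2 \frac{1}{2}\alpha} = \frac{1}{2}\,\frac{\cos n\alpha - \cos (n + 1)\alpha}{1 - \cos \alpha},$$

by using the addition formulas for the cosine. We thus obtain the formula

$$\frac{\sigma_0(\alpha) + \sigma_1(\alpha) + \cdots + \sigma_n(\alpha)}{n + 1} = \frac{1}{2(n + 1)}\,\frac{1 - \cos (n + 1)\alpha}{1 - \cos \alpha} = \frac{1}{2(n + 1)}\left(\frac{\sin [(n + 1)\alpha/2]}{\sin (\alpha/2)}\right)^2 = s_n(\alpha).$$

Since by the definition of the $\sigma_n(\alpha)$ [see (14), p. 586]

$$\frac{1}{\pi}\int_{-\pi}^{\pi} \sigma_n(\alpha)\,d\alpha = 1,$$

it follows that

$$\frac{1}{\pi}\int_{-\pi}^{\pi} s_n(\alpha)\,d\alpha = 1.$$

Now [see (28a), p. 595]

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x + t)\,\sigma_n(t)\,dt$$

and hence

$$F_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x + t)\,\frac{\sigma_0(t) + \cdots + \sigma_n(t)}{n + 1}\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x + t)\,s_n(t)\,dt.$$

For any positive $\delta$

$$f(x) - F_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} [f(x) - f(x + t)]\,s_n(t)\,dt$$

$$= \frac{1}{\pi}\int_{-\delta}^{\delta} [f(x) - f(x + t)]\,s_n(t)\,dt + \frac{1}{\pi}\int_{\delta}^{\pi} [f(x) - f(x + t)]\,s_n(t)\,dt + \frac{1}{\pi}\int_{-\pi}^{-\delta} [f(x) - f(x + t)]\,s_n(t)\,dt.$$

Now for $f(x)$ continuous the continuity is uniform and we can choose a $\delta$ such that $|f(x) - f(x + t)| < \tfrac{1}{2}\epsilon$ for all $x$ in $[-\pi, \pi]$ and for $|t| < \delta$. Moreover $f$ is bounded, say $|f| \le M$. Since from its definition

$$|s_n(t)| \le \frac{1}{2(n + 1)\sin^2 (\delta/2)} \qquad \text{for } \delta \le |t| \le \pi,$$

we find using $s_n \ge 0$ that

$$|f(x) - F_n(x)| \le \frac{\epsilon}{2\pi}\int_{-\delta}^{\delta} s_n(t)\,dt + \frac{2M}{\pi}\int_{\delta}^{\pi} |s_n(t)|\,dt + \frac{2M}{\pi}\int_{-\pi}^{-\delta} |s_n(t)|\,dt$$

$$\le \frac{\epsilon}{2\pi}\int_{-\pi}^{\pi} s_n(t)\,dt + \frac{2M}{\pi} \cdot \frac{2\pi}{2(n + 1)\sin^2 (\delta/2)} = \frac{\epsilon}{2} + \frac{2M}{(n + 1)\sin^2 (\delta/2)}.$$

Clearly,

$$|f(x) - F_n(x)| < \epsilon$$

for $n$ sufficiently large, and the theorem is proved.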
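Two ingredients of this proof can be checked numerically: the closed form of the averaged kernel, and the uniform convergence of the arithmetical means for a concrete continuous function. The sketch below uses $f(x) = |x|$, whose only nonvanishing coefficients are $a_0 = \pi$ and $a_\nu = -4/(\pi\nu^2)$ for odd $\nu$; averaging $S_0, \ldots, S_n$ multiplies the $\nu$th harmonic by $1 - \nu/(n + 1)$. All function names and the grid are our own assumptions.

```python
import math

def fejer_kernel_closed(alpha, n):
    """(1 / (2 (n + 1))) * (sin((n + 1) alpha / 2) / sin(alpha / 2))^2."""
    ratio = math.sin((n + 1) * alpha / 2.0) / math.sin(alpha / 2.0)
    return ratio * ratio / (2.0 * (n + 1))

def fejer_kernel_mean(alpha, n):
    """Arithmetic mean of sigma_0..sigma_n, sigma_k = 1/2 + cos a + ... + cos ka."""
    total, sigma = 0.0, 0.5
    for k in range(n + 1):
        if k > 0:
            sigma += math.cos(k * alpha)
        total += sigma
    return total / (n + 1)

def fejer_mean_abs(x, n):
    """Arithmetical mean F_n(x) of the Fourier polynomials of f(x) = |x|."""
    total = math.pi / 2.0
    for v in range(1, n + 1, 2):          # only odd harmonics occur
        weight = 1.0 - v / (n + 1.0)      # averaging S_0..S_n damps the v-th term
        total += weight * (-4.0 / (math.pi * v * v)) * math.cos(v * x)
    return total

for n in (20, 200):
    grid = [-math.pi + 2.0 * math.pi * k / 400 for k in range(401)]
    print(n, max(abs(abs(x) - fejer_mean_abs(x, n)) for x in grid))
```

The closed form of the kernel is manifestly nonnegative, which is the property the estimate in the proof relies on; it must not be evaluated where $\sin(\alpha/2) = 0$.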


*d. Approximation in the Mean and Parseval’s Relation 


The proximity of two functions g(x) and A(z) in a closed interval J in 
which they are continuous can be measured with a view to uniform 
convergence by the maximum value of |g(x) — h(x)|. Calling the maximum 
absolute value of a continuous function (x) in J its maximum norm, 
we can express the uniform convergence of a sequence of functions f, 
to a function f as n—> œ by saying that the maximum norm of the 
difference f — f, or also of f, — fm tends to zero. 

For Fourier approximations (as well as for other important mathe- 
matical theories outside the scope of this volume) it is natural to con- 
sider another measure or “norm” for the deviation between two 
functions, or what is sufficient, for the “distance” of a function (x) 
from the function identically zero. This is the “quadratic mean” or 
the “mean square norm” u = |\¢|| defined by an average value 


pate oi: 
wail sara Isl, 


where $l$ is the length of the interval $I$. It is a cruder measure than the
maximum norm insofar as its smallness does not necessarily mean that
the function is small everywhere.

As an example, the norm of $x^n$ over the interval $I$: $0 \le x \le 1$ has the
value $(2n+1)^{-1/2}$, which can be made arbitrarily small by choosing $n$
sufficiently large, whereas the function $x^n$ is equal to 1 for $x = 1$.
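
The contrast between the two norms in this example can be checked numerically. The following is only a sketch under stated assumptions: the helper names `mean_square_norm` and `max_norm` are illustrative and not from the text, the integral is approximated by the midpoint rule, and the maximum is taken over a finite sample of points.

```python
import math

def mean_square_norm(phi, a, b, samples=20000):
    """Approximate sqrt((1/l) * integral of phi(x)^2 over [a, b]) by the midpoint rule."""
    l = b - a
    h = l / samples
    total = sum(phi(a + (i + 0.5) * h) ** 2 for i in range(samples))
    return math.sqrt(total * h / l)

def max_norm(phi, a, b, samples=20000):
    """Approximate the maximum of |phi| on [a, b] by sampling, endpoints included."""
    return max(abs(phi(a + i * (b - a) / samples)) for i in range(samples + 1))

for n in (1, 5, 50):
    power = lambda x, n=n: x ** n
    # columns: n, numerical quadratic norm, exact value (2n+1)^(-1/2), maximum norm
    print(n, mean_square_norm(power, 0.0, 1.0), (2 * n + 1) ** -0.5, max_norm(power, 0.0, 1.0))
```

The quadratic norm shrinks as n grows while the maximum norm stays equal to 1, which is the sense in which the quadratic norm is the cruder measure.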

If the quadratic norm $\|f_n - f\|$ tends to zero as $n \to \infty$, then we say
that $f_n$ tends to $f$ in the quadratic mean.

The quadratic norm can be usefully denoted as “distance” because of
the so-called triangle inequality, corresponding to that valid for numbers
(see p. 14). This inequality $\|f + g\| \le \|f\| + \|g\|$ for two functions
$f$ and $g$ follows immediately: applying the inequality $pq \le \frac{1}{2}(p^2 + q^2)$
with

\[ p = \frac{|f(x)|}{\|f\|}, \qquad q = \frac{|g(x)|}{\|g\|} \]

and integrating over $I$, we find

\[ \frac{1}{l}\int_I f(x)g(x)\,dx \le \|f\|\,\|g\|. \]

Now

\[ \|f + g\|^2 = \frac{1}{l}\int_I [f(x) + g(x)]^2\,dx
   = \|f\|^2 + \|g\|^2 + \frac{2}{l}\int_I f(x)g(x)\,dx
   \le (\|f\| + \|g\|)^2 \]

or

\[ \|f + g\| \le \|f\| + \|g\|. \]


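With these concepts we may illuminate Bessel’s inequality of Section
8.6b. We show first: The closest approximation in the mean to a given
piecewise continuous function $f(x)$ by a trigonometric polynomial

\[ T_n = \frac{c_0}{2} + \sum_{\nu=1}^{n} (c_\nu \cos \nu x + d_\nu \sin \nu x)
       = \sum_{\nu=-n}^{n} \beta_\nu e^{i\nu x} \]

of order $n$ [with $\beta_0 = c_0/2$, $\beta_\nu + \beta_{-\nu} = c_\nu$, $i(\beta_\nu - \beta_{-\nu}) = d_\nu$] with
freedom of the choice of the coefficients $c_\nu$, $d_\nu$ is given by the Fourier
polynomial

\[ S_n = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)
       = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}, \]

where $a_\nu$, $b_\nu$, and $\alpha_\nu$ are the real and complex Fourier coefficients,
determined from $f$ by the formulas (26), p. 594, respectively.
The proof, written in complex notation for brevity, is easily obtained
using the orthogonality relations (25) in the interval $I = [-\pi, \pi]$ for
the functions $e^{i\nu x}$. Since $f$ and $T_n$ are real, $\alpha_{-\nu} = \bar\alpha_\nu$ and
$\beta_{-\nu} = \bar\beta_\nu$, and therefore

\[ \Big\| f - \sum_{\nu=-n}^{n} \beta_\nu e^{i\nu x} \Big\|^2
   = \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big[ f(x)^2 - 2\sum_{\nu=-n}^{n} \beta_\nu f(x) e^{i\nu x}
     + \sum_{\nu=-n}^{n}\sum_{\mu=-n}^{n} \beta_\nu \beta_\mu e^{i(\nu+\mu)x} \Big]\,dx \]
\[ = \|f\|^2 - 2\sum_{\nu=-n}^{n} \beta_\nu \bar\alpha_\nu + \sum_{\nu=-n}^{n} \beta_\nu \bar\beta_\nu \]
\[ = \|f\|^2 - \sum_{\nu=-n}^{n} \alpha_\nu \bar\alpha_\nu
     + \sum_{\nu=-n}^{n} (\alpha_\nu - \beta_\nu)(\bar\alpha_\nu - \bar\beta_\nu) \]
\[ = \|f\|^2 - \sum_{\nu=-n}^{n} |\alpha_\nu|^2 + \sum_{\nu=-n}^{n} |\alpha_\nu - \beta_\nu|^2. \]

Clearly, this last expression is minimized when the $\beta_\nu$ are chosen as
the Fourier coefficients $\alpha_\nu$; that is, for $\beta_\nu = \alpha_\nu$, or equivalently,
$c_\nu = a_\nu$, $d_\nu = b_\nu$.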
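
As a numerical illustration of this minimum property, the sketch below compares the mean square error of the Fourier polynomial of order 3 for the sample function f(x) = x with the error obtained after moving one coefficient away from its Fourier value. The function names, the choice of f, the order 3 and the perturbation 0.1 are all assumptions made for the illustration; integrals are approximated by the midpoint rule.

```python
import math

def integrate(g, a, b, samples=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / samples
    return h * sum(g(a + (i + 0.5) * h) for i in range(samples))

def fourier_coefficients(f, n):
    """Real Fourier coefficients a_0..a_n, b_0..b_n of f on [-pi, pi] (b_0 is unused)."""
    a = [integrate(lambda x: f(x) * math.cos(v * x), -math.pi, math.pi) / math.pi
         for v in range(n + 1)]
    b = [integrate(lambda x: f(x) * math.sin(v * x), -math.pi, math.pi) / math.pi
         for v in range(n + 1)]
    return a, b

def mean_square_error(f, c, d):
    """(1/2pi) * integral of (f - T_n)^2 with T_n = c_0/2 + sum (c_v cos vx + d_v sin vx)."""
    def T(x):
        return c[0] / 2 + sum(c[v] * math.cos(v * x) + d[v] * math.sin(v * x)
                              for v in range(1, len(c)))
    return integrate(lambda x: (f(x) - T(x)) ** 2, -math.pi, math.pi) / (2 * math.pi)

f = lambda x: x
a, b = fourier_coefficients(f, 3)
best = mean_square_error(f, a, b)
d = list(b)
d[1] += 0.1                       # move one coefficient away from its Fourier value
worse = mean_square_error(f, a, d)
print(best, worse, worse - best)  # the gap should be close to 0.1**2 / 2
```

The gap agrees with the last sum in the proof: changing $d_1$ by 0.1 changes $\beta_1$ and $\beta_{-1}$ by 0.05 in absolute value each, so the sum of the $|\alpha_\nu - \beta_\nu|^2$ is 0.005.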

We can now prove Parseval’s theorem, using the approximation 
results obtained above. 

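Bessel’s inequality

\[ \frac{1}{2}a_0^2 + \sum_{\nu=1}^{n} (a_\nu^2 + b_\nu^2) \le \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx \]

for $n \to \infty$ becomes Parseval’s equality

\[ \frac{1}{2}a_0^2 + \sum_{\nu=1}^{\infty} (a_\nu^2 + b_\nu^2) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx \]

for any function $f(x)$ of period $2\pi$ which is continuous for all $x$.

PROOF. By the Weierstrass approximation theorem for trigonometrical
polynomials we may choose a sequence of polynomials $T_n$
such that $f(x) - T_n(x) \to 0$ uniformly in $x$. Then also

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x) - T_n(x)]^2\,dx \to 0 \qquad \text{as } n \to \infty. \]

However, according to our last result, the Fourier polynomial

\[ S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x) \]

yields the closest approximation to $f(x)$ in the mean among all
trigonometric polynomials of order $n$, so that

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx \le \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x) - T_n(x)]^2\,dx. \]

It follows that

\[ \lim_{n\to\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0. \]

On expanding the square in the integrand as on p. 613, we obtain the
Parseval relation.

Finally, we remark that Parseval’s relation remains valid even if
$f(x)$ has a number of jump discontinuities. The simple proof is omitted
here.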
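
Parseval's equality can be checked on a concrete case. The sketch below assumes the sample function f(x) = x² on [−π, π], whose Fourier coefficients are a₀ = 2π²/3, a_ν = 4(−1)^ν/ν², b_ν = 0; the function name `bessel_partial_sum` is illustrative only.

```python
import math

def bessel_partial_sum(n):
    """Left side of Bessel's inequality for f(x) = x^2: a_0^2/2 + sum_{v<=n} a_v^2."""
    a0 = 2 * math.pi ** 2 / 3
    return 0.5 * a0 ** 2 + sum((4 / v ** 2) ** 2 for v in range(1, n + 1))

# Right side: (1/pi) * integral of x^4 over [-pi, pi] = 2 pi^4 / 5
rhs = 2 * math.pi ** 4 / 5

for n in (1, 10, 100, 1000):
    print(n, bessel_partial_sum(n), rhs - bessel_partial_sum(n))
```

The partial sums stay below the right-hand side (Bessel) and the gap tends to zero (Parseval).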


Appendix I


*A.I.1 Stretching of the Period Interval. Fourier’s Integral Theorem 


The base interval $-\pi \le x \le \pi$ for our periodic functions could be
replaced by any interval $-B \le x \le B$. By the transformation
$y = \pi x/B$ this interval of length $2B$ is transformed into the interval
$-\pi \le y \le \pi$, and a function $f(x)$ with the period $2B$ is transformed
into a function $g(y) = f(By/\pi) = f(x)$ with period $2\pi$. The main
theorem, written in the complex form [see formula (27b), p. 594], implies

\[ g(y) = \sum_{\nu=-\infty}^{\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t)\,e^{-i\nu(t-y)}\,dt \]

and therefore by this transformation

\[ f(x) = \frac{1}{2\pi}\sum_{\nu=-\infty}^{\infty} \frac{\pi}{B}\int_{-B}^{B} f(s)\,e^{-i\nu\pi(s-x)/B}\,ds, \tag{41} \]

where the variable of integration is replaced by $s = Bt/\pi$.

The relation (41) is valid for every function piecewise smooth in
$-B \le x \le B$.

We set $\pi/B = h$, $\nu\pi/B = \nu h = u_\nu$, and write (41) in the form

\[ f(x) = \frac{1}{2\pi}\sum_{\nu=-\infty}^{\infty} h\int_{-B}^{B} f(s)\,e^{-iu_\nu(s-x)}\,ds
        = \frac{1}{2\pi}\sum_{\nu=-\infty}^{\infty} h\,e^{iu_\nu x} H_\nu, \]

with $H_\nu = \int_{-B}^{B} e^{-iu_\nu s} f(s)\,ds$. Now the formal passage to the limit for
$B \to \infty$ or $\Delta u = h \to 0$ is obvious and yields

\[ f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{iux}\,du \int_{-\infty}^{\infty} e^{-ius} f(s)\,ds. \tag{42} \]

This is Fourier’s integral formula which will be proved rigorously in
Volume II for a large class of functions $f$. This formula can be written
in a clearer symmetric form as a pair of reciprocal integral relations
between a function $f(x)$ and its “Fourier transform” $F(u)$:

\[ F(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(s)\,e^{-ius}\,ds, \tag{43} \]

\[ f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(u)\,e^{iux}\,du. \tag{43a} \]

Fourier’s integral formula (42) can be written in a form which does
not involve the use of imaginary exponents. We only have to make use
of the expressions

\[ e^{iux}e^{-ius} = e^{iu(x-s)} = \cos u(s - x) - i \sin u(s - x). \]




Since $\sin u(s - x)$ is an odd function of $u$ and $\cos u(s - x)$ an even
function, integration with respect to $u$ from $-\infty$ to $+\infty$ of the sine
term makes no contribution, whereas integrating the cosine term yields
twice the value obtained from integrating from 0 to $\infty$. Hence

\[ f(x) = \frac{1}{\pi}\int_{0}^{\infty} du \int_{-\infty}^{\infty} f(s) \cos u(s - x)\,ds. \tag{43b} \]
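
The transform (43) can be tried out numerically on a function whose transform is known. The sketch below assumes the Gaussian f(s) = exp(−s²/2), for which F(u) = exp(−u²/2); the function name `fourier_transform`, the truncation of the integral to [−12, 12] and the midpoint rule are choices made for the illustration only.

```python
import math

def fourier_transform(f, u, limit=12.0, samples=4000):
    """Approximate F(u) = (1/sqrt(2 pi)) * integral of f(s) e^{-ius} ds over [-limit, limit]."""
    h = 2 * limit / samples
    re = im = 0.0
    for i in range(samples):
        s = -limit + (i + 0.5) * h
        re += f(s) * math.cos(u * s)   # real part of f(s) e^{-ius}
        im -= f(s) * math.sin(u * s)   # imaginary part of f(s) e^{-ius}
    scale = h / math.sqrt(2 * math.pi)
    return complex(re * scale, im * scale)

gauss = lambda s: math.exp(-s * s / 2)
for u in (0.0, 0.5, 1.5):
    print(u, fourier_transform(gauss, u), math.exp(-u * u / 2))
```

For this even function the imaginary part vanishes, in line with the remark that only the cosine term contributes.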


*A.I.2 Gibbs’ Phenomenon at Points of Discontinuity


The nature of the convergence of the Fourier series in the vicinity of 
a jump discontinuity exhibits a remarkable feature which Gibbs dis- 
covered by examining the graphs of the Fourier polynomials 


\[ S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x). \]


As already emphasized in Chapter 7, p. 530 the nonuniform convergence 
of a convergent sequence in the vicinity of a discontinuity of the limit 
function can be visualized by the way the continuous graphs of the 
approximating functions fail to approach the discontinuous graph of 
the limit function. 

In Fourier expansions these graphs do not simply approach the
graph of $f(x)$ supplemented by vertical connecting segments $x = \xi$ joining
the two end points at the jump position $\xi$. Instead, the graphs of $S_n$
show waves which near $\xi$ exceed the ordinates $f(\xi + 0)$ and $f(\xi - 0)$ to
either side by about 9% of the total height of the jump. Thus the
approximating graphs do approximate the graph of $f(x)$ augmented by
a vertical line segment at $x = \xi$, not only connecting the two points on
the graph of $f(x)$ but overshooting this connecting line segment at
both ends; see Figs. 8.4 and 8.5, pp. 579 and 580.

The mathematical analysis of this situation is simple and need be
discussed only for the discontinuity of the function $\chi(x)$ of Section 8.4d
to which all jump discontinuities were reduced on p. 607.

The function $\chi(x)$ for positive $x$ is [see formula (23a), p. 592] given by

\[ \chi(x) = \tfrac{1}{2}(\pi - x) = \sum_{\nu=1}^{\infty} \frac{\sin \nu x}{\nu}. \]

By integration of formula (14), p. 586, we find that

\[ S_n(x) = \sum_{\nu=1}^{n} \frac{\sin \nu x}{\nu}
          = -\frac{x}{2} + \int_0^x \frac{\sin (n + \frac{1}{2})t}{2 \sin \frac{1}{2}t}\,dt. \]

Hence the remainder $r_n(x) = \chi(x) - S_n(x)$ takes the form

\[ r_n(x) = \tfrac{1}{2}\pi - \int_0^x \frac{\sin (n + \frac{1}{2})t}{t}\,dt + \rho_n(x), \]

where

\[ \rho_n(x) = \int_0^x \frac{2 \sin \frac{1}{2}t - t}{2t \sin \frac{1}{2}t}\,\sin (n + \tfrac{1}{2})t\,dt. \]

Since the expression $(2 \sin \frac{1}{2}t - t)/(2t \sin \frac{1}{2}t)$ is sectionally continuous
and has a sectionally continuous first derivative, the lemma on p. 588
implies that $\rho_n(x)$ for $n \to \infty$ tends to zero uniformly for $0 \le x \le \pi$.
Moreover,

\[ \sigma_n(x) = \tfrac{1}{2}\pi - \int_0^x \frac{\sin (n + \frac{1}{2})t}{t}\,dt
             = \tfrac{1}{2}\pi - \int_0^{(n + \frac{1}{2})x} \frac{\sin t}{t}\,dt \]

tends to zero for each individual positive $x$ as $n \to \infty$ (see p. 589). The
convergence, however, is not uniform. Clearly, the derivative of
$\sigma_n(x)$ vanishes at the points $x_k = 2k\pi/(2n + 1)$ for $k = 1, 2, 3, \ldots$.
It is easily seen that more precisely $\sigma_n(x)$ has minima at the points
$x_1, x_3, x_5, \ldots$ and maxima at $x_2, x_4, \ldots$. Moreover, the values of $\sigma_n$
at the minimum points form an increasing sequence. Thus $\sigma_n(x)$ has as
its “absolute” minimum for positive $x$ the value

\[ \sigma_n(x_1) = \tfrac{1}{2}\pi - \int_0^{\pi} \frac{\sin t}{t}\,dt
   = \tfrac{1}{2}\pi - \int_0^{\pi} \Big(1 - \frac{t^2}{3!} + \frac{t^4}{5!} - + \cdots\Big)\,dt \]
\[ = \pi\Big(-\frac{1}{2} + \frac{\pi^2}{3 \cdot 3!} - \frac{\pi^4}{5 \cdot 5!} + - \cdots\Big)
   \approx -0.090\,\pi. \]

For large $n$ the remainder $r_n$ is approximately equal to $\sigma_n$. Hence for
large $n$ the approximating polynomial $S_n$ exceeds the function $\chi$ by
about $(9/100)\pi$, that is, by about 9% of the difference of the limiting
values of the function at the origin from the right and left. Thus the
oscillating branches of the graph of $S_n(x)$ indeed overshoot the height
of the graph of $\chi(x)$ and exhibit the limit phenomenon described above.

It is easily seen that the Fejér mean values of the sums $S_n(x)$ are free
from Gibbs’ phenomenon.
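
The overshoot can be observed directly. The following sketch, with illustrative function names that are not from the text, evaluates the partial sum of the series for χ(x) = (π − x)/2 at the first critical point x₁ = 2π/(2n + 1), compares the overshoot with the limiting value obtained from the termwise-integrated sine series, and also evaluates the arithmetic means of the partial sums.

```python
import math

def chi(x):
    """chi(x) = (pi - x)/2, valid for 0 < x < 2 pi."""
    return (math.pi - x) / 2

def partial_sum(n, x):
    """S_n(x) = sum_{v=1}^n sin(v x)/v."""
    return sum(math.sin(v * x) / v for v in range(1, n + 1))

def fejer_mean(n, x):
    """Arithmetic mean of S_0, ..., S_n, equal to sum (1 - v/(n+1)) sin(v x)/v."""
    return sum((1 - v / (n + 1)) * math.sin(v * x) / v for v in range(1, n + 1))

def overshoot_fraction(n):
    """(S_n(x_1) - chi(x_1)) / pi at x_1 = 2 pi/(2n + 1); the jump of chi at 0 is pi."""
    x1 = 2 * math.pi / (2 * n + 1)
    return (partial_sum(n, x1) - chi(x1)) / math.pi

def sine_integral_pi(terms=20):
    """Integral of sin(t)/t from 0 to pi, from the termwise-integrated power series."""
    return sum((-1) ** j * math.pi ** (2 * j + 1) / ((2 * j + 1) * math.factorial(2 * j + 1))
               for j in range(terms))

limit = (sine_integral_pi() - math.pi / 2) / math.pi
for n in (10, 100, 500):
    print(n, overshoot_fraction(n), limit)

# Largest value of the Fejer mean of order 50 on a grid in (0, pi], compared with pi/2
print(max(fejer_mean(50, k * math.pi / 2000) for k in range(1, 2001)), math.pi / 2)
```

The overshoot fraction settles near 0.0895, that is, about 9% of the jump, whereas the sampled Fejér means never exceed the limiting ordinate π/2.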
*A.I.3 Integration of Fourier Series


In general, as we have seen (p. 536), an infinite series can be integrated
term by term if it is uniformly convergent. However, for Fourier
series, we have the remarkable result that termwise integration is
always possible. We state: If $f(x)$ is a sectionally continuous function in
$-\pi \le x \le \pi$ having the formal Fourier expansion

\[ \tfrac{1}{2}a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x), \]

then for any two points $x_1$, $x_2$,

\[ \int_{x_1}^{x_2} f(x)\,dx = \int_{x_1}^{x_2} \tfrac{1}{2}a_0\,dx
   + \sum_{\nu=1}^{\infty} \int_{x_1}^{x_2} (a_\nu \cos \nu x + b_\nu \sin \nu x)\,dx, \]

or the Fourier series can be integrated termwise. Moreover, the series
on the right converges uniformly in $x_2$ for fixed $x_1$.

The remarkable part of this theorem is that not only do we not need
to assume the uniform convergence of the series but also we do not
even need to make use of its convergence.

To prove the theorem, define as on p. 606

\[ F(x) = \int_{-\pi}^{x} [f(t) - \tfrac{1}{2}a_0]\,dt. \]

$F(x)$ is continuous and has a sectionally continuous derivative; moreover,
it satisfies the condition $F(\pi) = F(-\pi) = 0$, so that it stays
continuous when it is periodically extended. Thus the Fourier series

\[ \tfrac{1}{2}A_0 + \sum_{\nu=1}^{\infty} (A_\nu \cos \nu x + B_\nu \sin \nu x) \]

of $F(x)$ converges uniformly to $F(x)$. Using integration by parts, we
obtain for $\nu \ne 0$ the values

\[ A_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} F(t) \cos \nu t\,dt
         = -\frac{1}{\pi}\int_{-\pi}^{\pi} [f(t) - \tfrac{1}{2}a_0]\,\frac{\sin \nu t}{\nu}\,dt = -\frac{b_\nu}{\nu}, \]

\[ B_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} F(t) \sin \nu t\,dt
         = \frac{1}{\pi}\int_{-\pi}^{\pi} [f(t) - \tfrac{1}{2}a_0]\,\frac{\cos \nu t}{\nu}\,dt = \frac{a_\nu}{\nu} \]

for the Fourier coefficients. Therefore the series

\[ F(x_2) - F(x_1) = \sum_{\nu=1}^{\infty} [A_\nu(\cos \nu x_2 - \cos \nu x_1) + B_\nu(\sin \nu x_2 - \sin \nu x_1)] \]
\[ = \sum_{\nu=1}^{\infty} \Big[ -\frac{b_\nu}{\nu}(\cos \nu x_2 - \cos \nu x_1) + \frac{a_\nu}{\nu}(\sin \nu x_2 - \sin \nu x_1) \Big] \]

converges uniformly in $x_2$. Replacing $F(x_2) - F(x_1)$ by $\int_{x_1}^{x_2} [f(x) - \tfrac{1}{2}a_0]\,dx$, we
obtain the relation

\[ \int_{x_1}^{x_2} [f(x) - \tfrac{1}{2}a_0]\,dx = \sum_{\nu=1}^{\infty} \int_{x_1}^{x_2} (a_\nu \cos \nu x + b_\nu \sin \nu x)\,dx \]

as was asserted.


Appendix II 


*A.II.1 Bernoulli Polynomials and Their Applications 
a. Definition and Fourier Expansion 


In the derivation of the Taylor series (p. 450) the polynomials
$P_n(x) = (x - \xi)^n/n!$, $n \ge 1$, in $x$ with parameter $\xi$ played a role. The
sequence of these polynomials is characterized by the conditions that
every polynomial $P_{n+1}$ is a primitive function of $P_n$, that is, $P'_{n+1}(x) =
P_n(x)$, and moreover, $P_n(\xi) = 0$ and $P_0(x) = 1$.

We now construct another remarkable sequence of polynomials, by
successive integration, the Bernoulli polynomials, which we shall then
extend as periodic functions and expand in Fourier series.

The Bernoulli polynomials $\phi_n(x)$, for $0 \le x \le 1$, are recursively
defined by the following relations:

\[ \phi_n'(x) = \phi_{n-1}(x), \qquad \phi_0(x) = 1, \tag{44a} \]

\[ \int_0^1 \phi_n(x)\,dx = 0 \qquad \text{for } n > 0. \tag{44b} \]

For known $\phi_0, \phi_1, \ldots, \phi_{n-1}$ condition (44a) determines $\phi_n$ within an
arbitrary constant of integration; this constant is then completely
fixed by the condition (44b). We see immediately by induction that
$\phi_n$ is a polynomial of the $n$th order with coefficients that are rational
numbers. The first Bernoulli polynomials are easily calculated:

\[ \phi_0(x) = 1, \]
\[ \phi_1(x) = x - \tfrac{1}{2}, \]
\[ \phi_2(x) = \tfrac{1}{2}x^2 - \tfrac{1}{2}x + \tfrac{1}{12}, \]
\[ \phi_3(x) = \tfrac{1}{6}x^3 - \tfrac{1}{4}x^2 + \tfrac{1}{12}x, \]
\[ \phi_4(x) = \tfrac{1}{24}x^4 - \tfrac{1}{12}x^3 + \tfrac{1}{24}x^2 - \tfrac{1}{720}. \]
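
The recursion (44a), (44b) translates directly into exact rational arithmetic. The sketch below is illustrative: the function name `bernoulli_polynomials` and the representation of a polynomial as a list of coefficients in ascending powers of x are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def bernoulli_polynomials(count):
    """Return phi_0, ..., phi_{count-1} as coefficient lists [c_0, c_1, ...] in powers of x."""
    polys = [[Fraction(1)]]                       # phi_0(x) = 1
    for _ in range(count - 1):
        prev = polys[-1]
        # (44a): integrate phi_{n-1} termwise, constant of integration still 0
        integrated = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(prev)]
        # (44b): choose the constant so that the integral over [0, 1] vanishes
        mean = sum(c / (i + 1) for i, c in enumerate(integrated))
        integrated[0] = -mean
        polys.append(integrated)
    return polys

for n, coeffs in enumerate(bernoulli_polynomials(5)):
    print(n, [str(c) for c in coeffs])
```

The constant terms produced this way are the values $\phi_n(0)$ of the polynomials at the origin.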


For $n > 1$, we have by (44a, b)

\[ \phi_n(1) - \phi_n(0) = \int_0^1 \phi_{n-1}(t)\,dt = 0. \]

Therefore the polynomials $\phi_n$ may be extended from the basic interval
$0 \le x \le 1$ to all $x$ as continuous periodic functions $\psi_n(x)$ with the
period 1, the so-called Bernoulli functions, whereas the function $\psi_1(x)$
coincides with the discontinuous function $\frac{1}{2\pi}(2\pi x - \pi)$ and [see
formula (23b), p. 592] can be represented as a Fourier series

\[ \psi_1(t) = -\frac{1}{\pi}\Big( \frac{\sin 2\pi t}{1} + \frac{\sin 4\pi t}{2} + \frac{\sin 6\pi t}{3} + \cdots \Big). \tag{45a} \]

By means of successive integration, we obtain then

\[ \psi_n(t) = (-1)^{(n/2)+1}\,\frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{\cos 2\pi kt}{k^n} \qquad \text{for even } n, \tag{45b} \]

\[ \psi_n(t) = (-1)^{(n+1)/2}\,\frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{\sin 2\pi kt}{k^n} \qquad \text{for odd } n. \tag{45c} \]

In the original interval $0 < t < 1$ the periodic functions $\psi_n(t)$ are
identical with the Bernoulli polynomials $\phi_n(t)$.
For $n$ even $\psi_n$ is an even function, for $n$ odd $\psi_n$ is odd; equivalently

\[ \psi_n(-x) = (-1)^n \psi_n(x). \tag{45d} \]

The constant terms in the successive Bernoulli polynomials form a
noteworthy sequence of rational numbers

\[ b_n = \phi_n(0) = \begin{cases} \psi_n(0) & \text{for } n \ne 1, \\ -\tfrac{1}{2} & \text{for } n = 1. \end{cases} \tag{46a} \]

We obtain immediately from the Fourier expansion

\[ b_n = 0 \qquad \text{for odd } n = 3, 5, \ldots, \tag{46b} \]

\[ b_n = (-1)^{(n/2)+1}\,\frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{1}{k^n} \qquad \text{for even } n = 2, 4, \ldots. \tag{46c} \]

Furthermore, evidently for even $n = 2m$, the signs of $b_{2m}$ alternate.

In place of the numbers $b_n$ which decrease rapidly with increasing $n$,
Jacob Bernoulli introduced the following somewhat more suitable
numbers:

\[ B_m = (-1)^{m-1} (2m)!\, b_{2m}, \tag{47} \]

which we call the Bernoulli numbers. (That the numbers $B_{2m}^* =
(-1)^{m-1}B_m$ are identical with the Bernoulli numbers introduced on
p. 562 will become apparent later on.) In particular,

\[ B_1 = \frac{1}{6}, \quad B_2 = \frac{1}{30}, \quad B_3 = \frac{1}{42}, \quad B_4 = \frac{1}{30}, \quad
   B_5 = \frac{5}{66}, \quad B_6 = \frac{691}{2730}, \quad B_7 = \frac{7}{6}. \]

As a consequence of formula (46c), we have in

\[ \sum_{k=1}^{\infty} \frac{1}{k^{2n}} = (-1)^{n-1} (2\pi)^{2n}\,\tfrac{1}{2}\, b_{2n} = \frac{(2\pi)^{2n}}{2\,(2n)!}\,B_n \tag{48} \]

an explicit representation of Riemann’s $\zeta$-function $\zeta(s)$ for integers
$s = 2n$ (see p. 560) by known numbers. For example, we obtain such
striking formulas as

\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6} = \zeta(2) \]

and

\[ 1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots = \frac{\pi^4}{90} = \zeta(4). \]

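As $n \to \infty$, the numbers $b_{2n}$ and $B_n$ tend to zero and infinity,
respectively. For, first of all, we have

\[ 1 < \sum_{k=1}^{\infty} \frac{1}{k^{2n}} < 2. \]

Therefore, by (46c),

\[ 2(2\pi)^{-2n} < |b_{2n}| < 4(2\pi)^{-2n}. \]

Since $2\pi > 1$ and $(2\pi)^{-2n} \to 0$ when $n \to \infty$, we have $b_{2n} \to 0$,
whereas $b_{2n+1} = 0$. Furthermore,

\[ B_n = (2n)!\,|b_{2n}| > 2\,(2n)!\,(2\pi)^{-2n}; \]

as is seen easily, the right-hand side tends to infinity.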


*b. Generating Function; the Taylor Series of 
the Trigonometric and Hyperbolic Cotangent 


The Bernoulli numbers and polynomials lead in an elegant manner
to the Taylor expansion of the cotangent and related functions. These
expansions follow most easily by means of the so-called generating
function of the Bernoulli functions, namely, the function

\[ F(t, z) = \sum_{n=0}^{\infty} \psi_n(t)\,z^n. \tag{49} \]

This is the power series in $z$ whose coefficients are the Bernoulli functions
of the parameter $t$. On the basis of the Fourier expansion of Eqns.
(45) we have the estimate

\[ |\psi_n(t)| \le \frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{1}{k^n}
   \le \frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{1}{k^2}
   = \frac{\pi^2}{3(2\pi)^n} < \frac{4}{(2\pi)^n} \]

for all $t$, and $n \ge 2$; hence the absolute value of the $n$th term of the
series for $F(t, z)$ is less than $4(|z|/2\pi)^n$. Thus for all $t$ the radius of
convergence of the power series in $z$ is at least $2\pi$, as one sees by comparison
with the geometric series.

Since for a fixed $z$ with $|z| < 2\pi$ the series for $F(t, z)$ has a convergent
majorant series, independent of $t$, it follows from the general theory
(see p. 535) that the series converges uniformly for all $t$. Thus it can be
integrated termwise in this domain; it can also be differentiated
termwise if the resulting series is also uniformly convergent. We use
this fact to determine an explicit formula for $F(t, z)$ (see p. 539).
Termwise differentiation with respect to $t$ yields formally for $0 < t < 1$
(for $t = 0$ or $t = 1$, $\psi_1(t)$ has no derivative)

\[ \frac{d}{dt}F(t, z) = \sum_{n=1}^{\infty} \psi_n'(t)\,z^n
   = z \sum_{n=1}^{\infty} \psi_{n-1}(t)\,z^{n-1}
   = z \sum_{n=0}^{\infty} \psi_n(t)\,z^n
   = zF(t, z). \]

This series has the same form as the original and is certainly uniformly
convergent, so that the termwise differentiation was justified. Hence
for every fixed $z$ with $|z| < 2\pi$ and for $0 < t < 1$, the generating
function $F(t, z)$ obeys the differential equation $dF/dt = zF(t, z)$. The
general solution of this differential equation is $F = ce^{zt}$, where $c$ is
a factor whose value depends on $z$ as parameter (see p. 223). To
determine $c$, we integrate the series for $F(t, z)$ with respect to $t$ between
0 and 1:

\[ \int_0^1 F(t, z)\,dt = c\int_0^1 e^{zt}\,dt = c\,\frac{e^z - 1}{z} \]

and

\[ \int_0^1 F(t, z)\,dt = \int_0^1 \sum_{n=0}^{\infty} \psi_n(t)\,z^n\,dt
   = 1 + \sum_{n=1}^{\infty} z^n \int_0^1 \psi_n(t)\,dt = 1. \]

Consequently, $c = z/(e^z - 1)$ and so we obtain the final result

\[ F(t, z) = \frac{ze^{zt}}{e^z - 1}. \tag{50} \]

Letting $t \to 0$ in this expression, we obtain the Taylor series for the
function $z/(e^z - 1)$:

\[ \lim_{t \to 0} F(t, z) = \frac{z}{e^z - 1} = 1 + \sum_{n=1}^{\infty} b_n z^n. \]

Since $b_1 = -\frac{1}{2}$, adding $\frac{1}{2}z$ to both sides yields

\[ \frac{z}{e^z - 1} + \frac{z}{2} = 1 + \sum_{n=2}^{\infty} b_n z^n. \tag{51} \]

Incidentally, this formula shows that the numbers $B_n^* = n!\,b_n$ are the
Bernoulli numbers introduced on p. 562. Since $b_0 = 1$ and $b_n = 0$ for
odd $n \ge 3$, we have

\[ \frac{z}{2}\,\frac{e^z + 1}{e^z - 1} = \frac{z}{2}\,\frac{e^{z/2} + e^{-z/2}}{e^{z/2} - e^{-z/2}}
   = \frac{z}{2}\,\frac{2\cosh \frac{1}{2}z}{2\sinh \frac{1}{2}z}
   = \tfrac{1}{2}z \coth \tfrac{1}{2}z = \sum_{n=0}^{\infty} b_{2n} z^{2n}
   = \sum_{n=0}^{\infty} \frac{B_{2n}^*}{(2n)!}\,z^{2n}. \tag{52} \]

Thus we obtain the Taylor series for the hyperbolic cotangent already
given on p. 563; the Taylor coefficients are simply related to the
Bernoulli numbers; we have proved now that the expansion holds for
all $|z| < 2\pi$.
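
A quick numerical check of the expansions of z/(e^z − 1) and of ½z coth ½z can be made with the first few coefficients. The sketch below is illustrative: the dictionary `b` holds b₀, b₁ and the values b₂ₘ = (−1)^(m−1) Bₘ/(2m)! obtained from B₁, ..., B₄ by (47), the series is truncated after the term in z⁸, and the function names are not from the text.

```python
import math

# b_0, b_1 and b_2, b_4, b_6, b_8 computed from B_1..B_4 = 1/6, 1/30, 1/42, 1/30 via (47)
b = {0: 1.0, 1: -0.5, 2: 1 / 12, 4: -1 / 720, 6: 1 / 30240, 8: -1 / 1209600}

def series(z):
    """Truncated power series 1 + sum b_n z^n through n = 8."""
    return sum(coef * z ** n for n, coef in b.items())

def closed_form(z):
    """z / (e^z - 1), evaluated with expm1 for accuracy near z = 0."""
    return z / math.expm1(z)

for z in (0.5, 1.0, 2.0):
    half_coth = 0.5 * z / math.tanh(0.5 * z)       # (1/2) z coth (1/2) z
    print(z, series(z), closed_form(z), series(z) + z / 2, half_coth)
```

The truncation error grows as |z| approaches 2π, the radius of convergence established above.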
Similarly, we obtain the Taylor series for the ordinary (trigonometric)
cotangent. We begin for $|z| < 2\pi$, $0 < t < 1$, with the generating
function

\[ G(t, z) = \sum_{n=0}^{\infty} (-1)^n \psi_{2n}(t)\,z^{2n}; \tag{53} \]

differentiating twice, we find that $G$ satisfies the differential equation
$d^2G/dt^2 + z^2G = 0$, whose general solution is $G = a \cos (zt) + b \sin (zt)$,
with $a$ and $b$ not depending on $t$ but possibly on $z$ as a parameter. To
determine $a$ and $b$ we use two conditions. The first,

\[ \int_0^1 G(t, z)\,dt = 1, \]

is found through termwise integration. The second,

\[ \lim_{t \to 0} \frac{dG(t, z)}{dt} = \tfrac{1}{2}z^2 \]

for all $z$, is found by termwise differentiation, in which we use the fact
that for $n > 1$,

\[ \psi_{2n}'(0) = \psi_{2n-1}(0) = b_{2n-1} = 0. \]

These conditions imply

\[ a = \tfrac{1}{2}z \cot \tfrac{1}{2}z, \qquad b = \tfrac{1}{2}z, \]

so that for $|z| < 2\pi$, $0 < t < 1$

\[ G(t, z) = \frac{z \cos (zt - z/2)}{2 \sin (z/2)}. \]

We leave the details to the reader.

If we let $t \to 0$ in this formula, we obtain the Taylor expansion of the
cotangent (see p. 563) for $|z| < 2\pi$

\[ G(0, z) = \sum_{n=0}^{\infty} (-1)^n b_{2n} z^{2n} = \tfrac{1}{2}z \cot \tfrac{1}{2}z. \tag{54} \]


c. The Euler-Maclaurin Summation Formula 


In Section 5.4b we derived Taylor’s formula using successive
integration by parts. In the following analogous derivation of a famous
formula of Euler, the Bernoulli polynomials, or rather their periodic
extensions $\psi_n(t)$, take the previous place of the polynomials $(t - b)^n/n!$.
(We thus replace $a$ and $b$ from p. 450 by 0 and 1, which is always
possible by means of the transformation of the variable $t$ into the
variable $s = (t - a)/(b - a)$, and is therefore not an essential change.)
Instead of beginning with the relation

\[ f(1) - f(0) = \int_0^1 f'(t)\,dt, \tag{55} \]

which would correspond to our previous derivation of the Taylor
formula, we begin with the relation

\[ \int_0^1 f(t)\,dt = \int_0^1 f(t)\,\psi_0(t)\,dt, \tag{56} \]

which leads to greater symmetry. Since

\[ \psi_0(t) = \psi_1'(t), \qquad \psi_1(+0) = -\tfrac{1}{2}, \]

and $\psi_1(1 - 0) = \tfrac{1}{2}$, the formula for integration by parts

\[ \int_0^1 u\,dv = uv\,\Big|_0^1 - \int_0^1 v\,du \]

for $u = f(t)$, $v = \psi_1(t)$, $f(0) = f_0$, $f(1) = f_1$ yields (see also Chapter 3,
p. 278)

\[ \int_0^1 f(t)\,dt = \tfrac{1}{2}(f_0 + f_1) - \int_0^1 f'(t)\,\psi_1(t)\,dt, \]

or

\[ \tfrac{1}{2}(f_0 + f_1) = \int_0^1 f(t)\,dt + \int_0^1 f'(t)\,\psi_1(t)\,dt, \tag{57} \]

an explicit expression for the deviation of the sum on the left from the
integral $\int_0^1 f(t)\,dt$.

Since a corresponding formula holds true for every interval between
two successive integers due to the periodicity of $\psi_1(t)$, we immediately
obtain

\[ \tfrac{1}{2}f_0 + f_1 + f_2 + \cdots + f_{n-1} + \tfrac{1}{2}f_n
   = \int_0^n f(x)\,dx + \int_0^n f'(x)\,\psi_1(x)\,dx, \tag{58} \]

or for any interval $a \le x \le b$, with $a$ and $b$ integers,

\[ f_a + f_{a+1} + \cdots + f_{b-1}
   = \int_a^b f(x)\,dx - \tfrac{1}{2}[f(b) - f(a)] + \int_a^b f'(x)\,\psi_1(x)\,dx, \qquad f_k = f(k). \tag{58a} \]

Thus we obtain an exact expression for the difference between the
left-hand sum (the area of the inscribed rectangles in the case of an
increasing function) and the first term on the right-hand side (the area
underneath the curve); formula (58a) is the simplest formulation of
the Euler-Maclaurin summation formula.
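
Because the simplest form of the summation formula is exact, it can be verified numerically for any smooth f. The sketch below takes the formula in the form: sum of f(k) for k = a, ..., b − 1 equals the integral of f over [a, b], minus ½[f(b) − f(a)], plus the integral of f′(x)ψ₁(x). The function names, the sample function f(x) = x³ and the midpoint rule applied on each unit interval (to avoid the jumps of ψ₁ at the integers) are choices made for the illustration.

```python
import math

def psi1(x):
    """Bernoulli function psi_1(x) = x - floor(x) - 1/2 (away from the integers)."""
    return x - math.floor(x) - 0.5

def integrate_unit_steps(g, a, b, samples=2000):
    """Midpoint rule applied separately on each interval [j, j+1], a <= j < b."""
    h = 1.0 / samples
    return sum(h * g(j + (i + 0.5) * h) for j in range(a, b) for i in range(samples))

def summation_formula_rhs(f, df, a, b):
    """Integral of f, minus half the end difference, plus the psi_1 correction term."""
    return (integrate_unit_steps(f, a, b)
            - 0.5 * (f(b) - f(a))
            + integrate_unit_steps(lambda x: df(x) * psi1(x), a, b))

f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
lhs = sum(f(k) for k in range(1, 4))              # f_1 + f_2 + f_3
print(lhs, summation_formula_rhs(f, df, 1, 4))
```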
It is natural to improve upon this result by repeating the integration
by parts. Integrating the expression $\int_a^b f'(x)\,\psi_1(x)\,dx$, and setting
$u = f'(x)$, $dv = \psi_1(x)\,dx$, we obtain

\[ \int_a^b f'(x)\,\psi_1(x)\,dx = f'(x)\,\psi_2(x)\,\Big|_a^b - \int_a^b f''(x)\,\psi_2(x)\,dx. \]

Since $\psi_2(b) = \psi_2(a) = \psi_2(0) = b_2$,
the first term takes the form

\[ b_2[f'(b) - f'(a)]; \]

the second term can be again integrated by parts, yielding

\[ -b_3[f''(b) - f''(a)] + \int_a^b f'''(x)\,\psi_3(x)\,dx. \]

Here, since $b_3 = 0$, the first expression vanishes; we again integrate
by parts, obtaining

\[ b_4[f'''(b) - f'''(a)] - \int_a^b f^{(4)}(x)\,\psi_4(x)\,dx. \]

Repeating this operation, until we reach $\psi_{2k}$, we obtain the general form
of the Euler summation formula

\[ f_a + f_{a+1} + \cdots + f_{b-1} = \int_a^b f(x)\,dx - \tfrac{1}{2}[f(b) - f(a)]
   + \sum_{m=1}^{k} b_{2m}[f^{(2m-1)}(b) - f^{(2m-1)}(a)] + R_k, \tag{59} \]

where the remainder $R_k$ may be written in one of the two forms

\[ R_k = -\int_a^b f^{(2k)}(x)\,\psi_{2k}(x)\,dx, \tag{60} \]

\[ R_k = \int_a^b f^{(2k+1)}(x)\,\psi_{2k+1}(x)\,dx. \tag{60a} \]


d. Applications. Asymptotic Expressions 


Convergent Expansions. Euler’s summation formula can be applied
in different circumstances. First, if $R_k \to 0$ as $k \to \infty$, then the infinite
series

\[ \sum_{m=1}^{\infty} b_{2m}[f^{(2m-1)}(b) - f^{(2m-1)}(a)] \]

converges, and the formula gives an important means of expressing the
sum of the corresponding series in closed form, or for expressing
definite functions as series.


Nonconvergent Expansions. Secondly, and of more importance, the
remainder $R_k$ may not tend to zero as $k \to \infty$; the above series does
not necessarily converge. Nevertheless it may happen that at first the
absolute values $|R_k|$ decrease with increasing $k$, and that $|R_k|$ for suitably
chosen values of $k$ is very small, whereas $|R_k|$ begins later (for large
$k$) to increase strongly. In this case the summation formula can be an
important tool for numerical computations; although it is not possible
to obtain arbitrarily high precision, as with convergent series, we can
nevertheless compute the value of the left side to within an error which
is at most equal to the least value $|R_k|$, which is often a highly
satisfactory precision. We shall examine examples of both these phenomena.


Example. Exponential Functions. We consider first the function
$f(x) = e^{zx}$ for some fixed $z$. With $a = 0$ and $b = 1$, we obtain for any
number $k$, the relation

\[ f(0) = \int_0^1 f(x)\,dx - \tfrac{1}{2}[f(1) - f(0)]
   + \sum_{n=1}^{k} b_{2n}[f^{(2n-1)}(1) - f^{(2n-1)}(0)] + R_k. \]

Consequently,

\[ 1 = \frac{1}{z}(e^z - 1) - \tfrac{1}{2}(e^z - 1) + \sum_{n=1}^{k} b_{2n} z^{2n-1}(e^z - 1) + R_k, \]

\[ \frac{z}{e^z - 1} = 1 - \frac{z}{2} + \sum_{n=1}^{k} b_{2n} z^{2n} + \frac{z}{e^z - 1}\,R_k, \]

where

\[ R_k = -\int_0^1 z^{2k} e^{zx}\,\psi_{2k}(x)\,dx. \]

Since $|\psi_{2k}(x)| \le 4/(2\pi)^{2k}$ (p. 622), it follows that

\[ |R_k| \le |z|^{2k} \cdot e^{|z|} \cdot \frac{4}{(2\pi)^{2k}} = 4e^{|z|}\Big(\frac{|z|}{2\pi}\Big)^{2k}, \]

or $R_k \to 0$, at least for $|z| < 2\pi$. Consequently, for these values of $z$, we
can allow $k$ to grow beyond all bounds in the summation formula,
obtaining

\[ \frac{z}{e^z - 1} = 1 - \tfrac{1}{2}z + \sum_{n=1}^{\infty} b_{2n} z^{2n} \tag{61} \]

for the function $z/(e^z - 1)$, a formula already found by other methods
(p. 623). We note that the interval of convergence is again $|z| < 2\pi$.


e. Sums of Powers; Recursion Formula for Bernoulli Numbers 


An even simpler example of a convergent Euler summation formula
occurs when the series on the right contains only finitely many terms,
especially if $f(x)$ is a polynomial of $r$th degree with $r \ge 1$, so that
$f^{(r+1)}(x)$ vanishes identically. We choose $f(x) = x^r$, $a = 0$, $b = n$, and
$k > r$. For simplification we again introduce the sequence $B_n^*$ of
Bernoulli numbers, defined previously (p. 623), as $B_n^* = n!\,b_n$, for all
$n$. Noting that

\[ B_0^* = 1, \quad B_1^* = -\tfrac{1}{2}, \quad B_3^* = B_5^* = B_7^* = \cdots = B_{2n+1}^* = 0, \]

we see that Euler’s formula (59) takes the form (the terms with $\nu > r$
vanish)

\[ 1 + 2^r + 3^r + \cdots + (n - 1)^r
   = \int_0^n x^r\,dx + \sum_{\nu=1}^{r} \frac{B_\nu^*}{\nu!}\,\big(f^{(\nu-1)}(n) - f^{(\nu-1)}(0)\big) \]
\[ = \frac{n^{r+1}}{r + 1} + \sum_{\nu=1}^{r} \frac{B_\nu^*}{\nu!}\,r(r - 1) \cdots (r - \nu + 2)\,n^{r-\nu+1} \]
\[ = \frac{1}{r + 1}\Big[ n^{r+1} + \sum_{\nu=1}^{r} \binom{r+1}{\nu} B_\nu^*\,n^{r+1-\nu} \Big] \]
\[ = \frac{1}{r + 1}\Big\{ \sum_{\nu=0}^{r+1} \binom{r+1}{\nu} B_\nu^*\,n^{r+1-\nu} - B_{r+1}^* \Big\}. \]

This formula can be written symbolically as

\[ 1 + 2^r + 3^r + \cdots + (n - 1)^r = \frac{1}{r + 1}\,\{(n + B^*)^{r+1} - B^{*\,r+1}\}, \tag{62} \]

where the term within the parentheses is to be expanded formally by
using the binomial theorem, and each of the “powers” $B^{*\,\nu}$ is to be
replaced by the corresponding Bernoulli number $B_\nu^*$. For example,

\[ 1 + 2^2 + 3^2 + \cdots + (n - 1)^2 = \tfrac{1}{3}(n^3 + 3n^2 B_1^* + 3n B_2^*) = \tfrac{1}{6}(2n^3 - 3n^2 + n), \]
\[ 1 + 2^3 + 3^3 + \cdots + (n - 1)^3 = \tfrac{1}{4}n^2(n - 1)^2 \]

(cf. p. 58).
By setting $n = 1$, formula (62) assumes the form

\[ \frac{1}{r + 1}\,\{(1 + B^*)^{r+1} - B^{*\,r+1}\} = 0, \]

or

\[ (1 + B^*)^{r+1} = B^{*\,r+1} \qquad \text{for all } r \ge 1, \tag{62a} \]

which is just the recursion formula for the $B_n^*$ given on p. 562.
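
The symbolic recursion and the power-sum formula lend themselves to a short computation in exact arithmetic. In the sketch below, which uses illustrative function names, the relation (1 + B*)^(r+1) = B*^(r+1) is expanded by the binomial theorem into the statement that the sum of C(r+1, ν)·B*_ν over ν = 0, ..., r vanishes, and this is solved for B*_r.

```python
from fractions import Fraction
from math import comb

def bernoulli_star(count):
    """B_0*, ..., B_{count-1}* from the symbolic recursion, starting with B_0* = 1."""
    B = [Fraction(1)]
    for r in range(1, count):
        # sum_{v=0}^{r} C(r+1, v) B_v* = 0, and C(r+1, r) = r + 1, so solve for B_r*
        B.append(-sum(comb(r + 1, v) * B[v] for v in range(r)) / (r + 1))
    return B

def power_sum(r, n):
    """1^r + 2^r + ... + (n-1)^r from the expanded symbolic formula (the B_{r+1}* terms cancel)."""
    B = bernoulli_star(r + 1)
    total = sum(comb(r + 1, v) * B[v] * n ** (r + 1 - v) for v in range(r + 1))
    return total / (r + 1)

print([str(x) for x in bernoulli_star(7)])
print(power_sum(2, 5), sum(k ** 2 for k in range(1, 5)))
print(power_sum(3, 4), sum(k ** 3 for k in range(1, 4)))
```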


f. Euler’s Constant and Stirling’s Series 


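An example of an application of the Euler-Maclaurin formula in the
second case, that of divergence, is given by the function $f(x) = 1/x$
with $a = 1$, $b = n$. By (58a)

\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n - 1}
   = \int_1^n \frac{dx}{x} + \frac{1}{2} - \frac{1}{2n} - \int_1^n \frac{\psi_1(x)}{x^2}\,dx
   = \log n + \frac{1}{2} - \frac{1}{2n} - \int_1^n \frac{\psi_1(x)}{x^2}\,dx \tag{63} \]

or

\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n
   = \frac{1}{2} + \frac{1}{2n} - \int_1^n \frac{\psi_1(x)}{x^2}\,dx. \]

For $n \to \infty$ the integral on the right side converges, since $|\psi_1(x)| \le \frac{1}{2}$
for all $x$; thus the absolute value of the integrand is always less than
that of the convergent integral $\int_1^{\infty} dx/x^2$. Hence we obtain in the
relation

\[ \lim_{n \to \infty} \Big[ \sum_{k=1}^{n} \frac{1}{k} - \log n \Big]
   = \frac{1}{2} - \int_1^{\infty} \frac{\psi_1(x)}{x^2}\,dx = C \tag{64} \]

a definite constant $C$, the Euler constant, already introduced on p. 526.

We have then two results: The harmonic series is of the same order
of growth as the logarithm, both diverging to infinity, and there is an
explicit expression for the difference between the two

\[ \sum_{k=1}^{n} \frac{1}{k} - \log n - C = R_n = \frac{1}{2n} + \int_n^{\infty} \frac{\psi_1(x)}{x^2}\,dx. \]

We note that $R_n$ vanishes for $n \to \infty$ at least of first order.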
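
The first-order behaviour of the remainder can be seen numerically. The sketch below compares the harmonic sum minus log n with the well-known numerical value 0.5772156649 of Euler's constant, which is quoted here for comparison and is not computed in the text; the function name is illustrative.

```python
import math

EULER_C = 0.5772156649   # known numerical value of Euler's constant, quoted for comparison

def harmonic_minus_log(n):
    """1 + 1/2 + ... + 1/n - log n."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)

for n in (10, 100, 1000):
    value = harmonic_minus_log(n)
    # columns: n, difference from C, difference after removing the leading term 1/(2n)
    print(n, value - EULER_C, value - 1 / (2 * n) - EULER_C)
```

The second column behaves like 1/(2n); once that leading term is removed, what remains is the much smaller contribution of the integral over ψ₁.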


We obtain a more important application when we set $f(x) = \log x$,
$a = 1$, $b = n$ in formula (59), p. 626. Then

\[ \log 1 + \log 2 + \cdots + \log (n - 1) = n \log n - n + 1 - \tfrac{1}{2}\log n \]
\[ \qquad - \sum_{m=1}^{k} b_{2m}(2m - 2)!\,\Big(1 - \frac{1}{n^{2m-1}}\Big)
   + \int_1^n \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx. \]

Adding $\log n$ to both sides, we obtain

\[ \log n! = (n + \tfrac{1}{2}) \log n - n + c_k + \sum_{m=1}^{k} \frac{(2m - 2)!}{n^{2m-1}}\,b_{2m} - r_k(n), \tag{65} \]

where

\[ c_k = 1 - \sum_{m=1}^{k} b_{2m}(2m - 2)! + \int_1^{\infty} \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx, \]

\[ r_k(n) = \int_n^{\infty} \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx. \]

The improper integrals converge for $k > 0$, since the functions
$\psi_{2k+1}(x)$ are periodic, and hence bounded for all $x$ (see p. 307). We can
find the value of the constant $c_k$ if we observe that by (65) for $n \to \infty$

\[ c_k = \lim_{n \to \infty} \log \frac{n!\,e^n}{n^{n+1/2}}. \]

We conclude then from Stirling’s formula (14), p. 504 (or directly from
Wallis’ product for $\pi$ as on p. 280) that $c_k = \log \sqrt{2\pi}$. If we still express
the numbers $b_{2m}$ as $(-1)^{m-1}B_m/(2m)!$ (see formula (47),
p. 620), we obtain the so-called Stirling series

\[ \log \Big( \frac{n!}{\sqrt{2\pi}\,n^{n+1/2} e^{-n}} \Big)
   = \sum_{m=1}^{k} \frac{(-1)^{m-1} B_m}{2m(2m - 1)\,n^{2m-1}} - r_k(n). \]

This formula is a refinement of Stirling’s formula. For any fixed
positive integer $k$ and large $n$ the terms in the sum approach zero
respectively of the order of $1/n$, $1/n^3$, $1/n^5$, $\ldots$, $1/n^{2k-1}$. The
remainder term $r_k(n)$ approaches zero like $1/n^{2k}$, since $\psi_{2k+1}(x)$ is a bounded
function. Thus for fixed $k$ and very large $n$ each term in the sum will
be very large compared to the following terms, and the remainder will
be smaller than all the terms in the sum. We thus obtain an
approximation formula of the form

\[ \log \frac{n!}{\sqrt{2\pi}\,n^{n+1/2} e^{-n}}
   \approx \frac{B_1}{1 \cdot 2\,n} - \frac{B_2}{3 \cdot 4\,n^3} + \frac{B_3}{5 \cdot 6\,n^5} - \cdots
   = \frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7} + \frac{1}{1188n^9} - \cdots. \tag{66} \]

This expansion must, however, not be considered in the same light
as a convergent infinite series. It is only asymptotically correct in the
sense that if we break off the series after a fixed number of terms,
say $k$ terms, then the error $r_k$ is small compared with all the terms
kept provided $n$ is sufficiently large. We can never make the error
arbitrarily small for a fixed $n$ by taking more and more terms. As a
matter of fact the infinite series (66) diverges, as we see immediately
from the estimate on p. 621 for the Bernoulli numbers. For a given
large $n$ there is an optimum number of terms of the series which one
might use. Thus for moderately large $n$ we have the approximation

\[ n! \approx \sqrt{2\pi}\,n^{n+1/2}\,e^{-n+1/(12n)}; \]

for very large $n$ the formula

\[ n! \approx \sqrt{2\pi}\,n^{n+1/2}\,e^{-n+1/(12n)-1/(360n^3)} \]

gives a more accurate approximation, etc.
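
The asymptotic character of the Stirling series shows up already in a small experiment. The sketch below uses the Bernoulli numbers B₁, ..., B₇ listed earlier and Python's `math.lgamma` as the reference value of log n!; the function names are illustrative. For n = 1 the error first decreases with the number k of terms and then grows again, while for n = 10 three terms already give a very small error.

```python
import math
from fractions import Fraction

# B_1, ..., B_7 as listed in the text
B = [Fraction(1, 6), Fraction(1, 30), Fraction(1, 42), Fraction(1, 30),
     Fraction(5, 66), Fraction(691, 2730), Fraction(7, 6)]

def stirling_sum(n, k):
    """First k terms (-1)^(m-1) B_m / (2m (2m-1) n^(2m-1)) of the Stirling series."""
    return sum((-1) ** (m - 1) * float(B[m - 1]) / (2 * m * (2 * m - 1) * n ** (2 * m - 1))
               for m in range(1, k + 1))

def stirling_error(n, k):
    """log(n! / (sqrt(2 pi) n^(n+1/2) e^(-n))) minus the k-term sum."""
    exact = math.lgamma(n + 1) - (0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n)
    return exact - stirling_sum(n, k)

for n in (1, 10):
    print(n, [abs(stirling_error(n, k)) for k in range(1, 8)])
```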


PROBLEMS 


SECTION 8.1, page 572 


1. The fundamental period T of a periodic function f is defined as the greatest 
lower bound of the positive periods of f. Prove: 

(a) If $T \ne 0$, then $T$ is a period.

(b) If $T \ne 0$, then every other period is an integral multiple of $T$.

(c) If $T = 0$ and if $f$ is continuous at any point, then $f$ is a constant
function.


2. Show that if $f$ has incommensurable periods $T_1$ and $T_2$, then the
fundamental period $T$ is zero. Give an example of a nonconstant function with
incommensurable periods.

3. Let f and g have fundamental periods a and b, respectively. If a and b 
are commensurable, say a/b =q/p, where p and q are relatively prime 
integers, then show by example that f + g can have as its fundamental period 
any value m/n, where m = aq = bp and n is any natural number. 


SECTION 8.5, page 598 


1. Obtain the Fourier series for the function $f(x) = \pi x$ on the interval
$0 < x < 1$ as a pure sine series and as a pure cosine series.

2. Show how to represent a function defined on an arbitrary bounded
interval as a Fourier series.


3. Obtain the infinite product for the cosine from the relation

\[ \cos \pi x = \frac{\sin 2\pi x}{2 \sin \pi x}. \]


4. Using the infinite products for the sine and cosine, evaluate 


(a) 2-2.9.8.10.10.14 
1 3 5 7 9 a Tee eee! Ui eee 
4 


.2.4.8.10.14.18 
(b) 2 3 3 9 9 15 15° 


5. Express the hyperbolic cotangent in terms of partial fractions. 
6. Determine the special properties of the coefficients of the Fourier
expansions of even and odd functions for which $f(x) = f(\pi - x)$.


SECTION 8.6, page 604


1. Investigate the convergence of the Fourier expansion

\[ \cos x + \frac{\cos 2x}{2} + \frac{\cos 3x}{3} + \cdots \]

of the function $-\log 2\left|\sin \frac{x}{2}\right|$.


SECTION 8.7, page 608 


1. Prove Parseval’s equation for a piecewise smooth function f where f 
may have a number of discontinuities. 


APPENDIX II.1, page 619 
1. Prove that 


TOPA N ate 
2. Prove for $n > 1$ that

\[ \phi_n(t) = (-1)^n \phi_n(1 - t). \]


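3. Using the expression for the cotangent in partial fractions, expand
$\pi x \cot \pi x$ as a power series in $x$. By comparing this with the series given on
p. 625, show that

\[ \sum_{\nu=1}^{\infty} \frac{1}{\nu^{2m}} = (-1)^{m-1}\,\frac{(2\pi)^{2m}}{2 \cdot (2m)!}\,B_{2m}^*. \]

4. Show that

\[ \sum_{\nu=1}^{\infty} \frac{1}{(2\nu - 1)^{2m}} = \frac{(-1)^{m-1}(2^{2m} - 1)\pi^{2m}}{2\,(2m)!}\,B_{2m}^*. \]

5. Show that

\[ \sum_{\nu=1}^{\infty} \frac{(-1)^{\nu}}{\nu^{2m}} = \frac{(-1)^{m}(2^{2m} - 2)\pi^{2m}}{2 \cdot (2m)!}\,B_{2m}^*. \]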

6. Using the infinite products for the sine and cosine, show that 


sin x Z Ga) aes an 
a ef E) = — OE as 


im 


_ — (—1)%-122-1(220 — 1) Bx 
(b) log cos x = 2 T 


a2, 


7. Prove that 


1 n 
@ | 2B? de = 7? 


o f 2 La EEE 
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Differential Equations 
for the Simplest Types of Vibration 


On several previous occasions we have met with differential equa- 
tions, that is, equations from which an unknown function is to be 
determined and which involve not only this function itself but also 
its derivatives. 

The simplest problem of this type is that of finding the indefinite 
integral of a given function f(x): to find a function y = F(x) which 
satisfies the differential equation y’ — f(z) =0. Furthermore, in 
Chapter 3, p. 223, we showed that an equation of the form y’ = ay is 
satisfied by an exponential function y = ce®*, and we characterized the 
trigonometrical functions by differential equations (p. 312). As we saw 
in Chapter 4 (e.g., p. 405), differential equations arise in connection with 
the problems of mechanics, and indeed many branches of pure mathe- 
matics and most of applied mathematics depend on differential 
equations. In this chapter, without going into the general theory, we 
shall consider the differential equations of the simplest types of 
vibration. These are not only of theoretical value but are also ex- 
tremely important in applied mathematics. 

It will be convenient to bear in mind the following general ideas and 
definitions. By a solution of a differential equation we mean a 
function which, when substituted in the differential equation, satisfies 
the equation “identically”; this means for all values of the inde- 
pendent variable that are being considered. Instead of solution the 
term integral is often used: first, because the problem is more or less a
generalization of the ordinary problem of integration; secondly, 
because it frequently happens that the solution is actually found by 
integration. 
9.1 Vibration Problems of Mechanics and Physics
a. The Simplest Mechanical Vibrations 


The simplest type of mechanical vibration has already been con- 
sidered in Chapter 4 (p. 404). We there considered a particle of mass m 
which is free to move on the $x$-axis and which is brought back to its
initial position $x = 0$ by a restoring force. The magnitude of this
restoring force we took to be proportional to the displacement $x$ by,
in fact, equating it to $-kx$, where $k$ is a positive constant and the
negative sign expresses the fact that the force is always directed toward
the origin. We shall now assume that there is a frictional force present
also and that this frictional force is proportional to the velocity
$dx/dt = \dot{x}$ of the particle and opposed to it. This force is then given by
an expression of the form $-r\dot{x}$, with a positive frictional constant $r$.
Finally, we shall assume that the particle is also acted on by an external
force which is a function $f(t)$ of the time $t$. Then by Newton's fundamental
law the product of the mass $m$ and the acceleration $\ddot{x}$ must be
equal to the total force, that is, the elastic force plus the frictional force
plus the external force. This is expressed by the equation


(1) $m\ddot{x} + r\dot{x} + kx = f(t)$.


This equation governs the motion of the particle. If we recall the 
previous examples of differential equations, such as the integration
problem for $\dot{x} = dx/dt = f(t)$ solved by $x = \int f(t)\,dt + c$, or the
solution of the particular differential equation $m\ddot{x} + kx = 0$ on p. 405,
we observe that these problems have an infinite number of distinct
solutions. Here too we shall find that there are an infinite number of
solutions, a fact expressed in the following way. It is possible to
find a general solution or complete integral $x(t)$ of the differential
equation, depending not only on the independent variable $t$ but also on
two arbitrary parameters $c_1$ and $c_2$, called the constants of integration.
Assigning special values to these constants we obtain a particular 
solution, and every solution can be found by assigning special values to 
these constants. 


This fact is quite understandable (cf. also p. 404). We cannot expect 
that the differential equation alone will determine the motion completely. 
On the contrary, it is plausible that at a given instant, say at the time 
$t = 0$, we should be able to choose the initial position $x(0) = x_0$ and the
initial velocity $\dot{x}(0) = \dot{x}_0$ (in short, the initial state) arbitrarily; in other
words, at time $t = 0$ we should be able to start the particle from any initial
position with any velocity. This being done, we may expect the rest of the
motion to be completely determined. The two arbitrary constants $c_1$ and $c_2$
in the general solution are just enough to enable us to select the particular
solution which fits these initial conditions. In the next section we shall see
that this can be done in one way only. 


If no external force is present, that is, if f(t) = 0, the motion is 
called a free motion. The differential equation is then said to be 
homogeneous. If f(t) is not equal to zero for all values of t, we say that 
the motion is forced and that the differential equation is nonhomo- 
geneous. The term f(t) is also occasionally referred to as the per- 
turbation term. 


b. Electrical Oscillations 


A mechanical system of the simple type described can physically 
be realized only approximately. An example is offered by the 
pendulum, provided its oscillations are small. The oscillations of a 
magnetic needle, the oscillations of the centre of
a telephone or microphone diaphragm, and
other mechanical vibrations can be represented
to within a certain degree of accuracy by systems
such as described. But there is another type
of phenomenon which corresponds with great
precision to our differential equation (1). This
is the oscillatory electrical circuit.

Figure 9.1 Oscillatory electrical circuit.

We consider the circuit sketched in Fig. 9.1,
having inductance $\mu$, resistance $\rho$, and capacity
$C = 1/\kappa$. We also suppose that the circuit is
acted on by an external electromotive force $\phi(t)$ which is known as a
function of the time $t$, such as the voltage supplied by a dynamo or the
voltage due to electric waves. In order to describe the process taking
place in the circuit we denote the voltage across the condenser by $E$ and
the charge in the condenser by $Q$. These quantities are then connected
by the equation $CE = E/\kappa = Q$. The current $I$, which like the voltage
$E$ is a function of the time, is defined as the rate of change of the charge
per unit time, that is, as the rate at which the charge on the condenser
diminishes: $I = -\dot{Q} = -dQ/dt = -\dot{E}/\kappa$. Ohm's law states that the
product of the current and the resistance is equal to the electromotive
force (voltage); that is, it is equal to the condenser voltage $E$ minus the
counter electromotive force due to self-induction plus the external
electromotive force $\phi(t)$. We thus arrive at the equation
$I\rho = E - \mu\dot{I} + \phi(t)$ or $-(\rho/\kappa)\dot{E} = E + (\mu/\kappa)\ddot{E} + \phi(t)$, that is,
$\mu\ddot{E} + \rho\dot{E} + \kappa E = -\kappa\phi(t)$, which is satisfied by the voltage in the circuit. We see therefore
that we have obtained a differential equation of exactly type (1).
Instead of the mass we have the inductance, instead of the frictional 
force, the resistance, and instead of the elastic constant, the reciprocal 
of the capacity, whereas the external electromotive force (apart from a 
constant factor) corresponds to the external force. If the electro- 
motive force is zero, the differential equation is homogeneous. 

If we multiply both sides of the differential equation by $-1/\kappa$ and
differentiate with respect to the time, we obtain for the current $I$ the
corresponding equation


$\mu\ddot{I} + \rho\dot{I} + \kappa I = \dot{\phi}(t)$,


which differs from the equation for the voltage on the right-hand side
only, and for free oscillations ($\phi = 0$) has identically the same form.


9.2 Solution of the Homogeneous Equation. Free Oscillations 
a. The Formal Solution 


We can easily obtain a solution of the homogeneous equation (1)
$m\ddot{x} + r\dot{x} + kx = 0$ in the form of an exponential expression, by determining
a constant $\lambda$ in such a way that the expression $e^{\lambda t} = x$ is a
solution. If we substitute this tentative solution and its derivatives
$\dot{x} = \lambda e^{\lambda t}$, $\ddot{x} = \lambda^2 e^{\lambda t}$ in the differential equation and remove the common
factor $e^{\lambda t}$, we obtain the quadratic equation


(2) $m\lambda^2 + r\lambda + k = 0$

for $\lambda$. The roots of this equation are

$\lambda_1 = -\frac{r}{2m} + \frac{1}{2m}\sqrt{r^2 - 4mk}$, $\lambda_2 = -\frac{r}{2m} - \frac{1}{2m}\sqrt{r^2 - 4mk}$.


Each of the two expressions $x = e^{\lambda_1 t}$ and $x = e^{\lambda_2 t}$ is, at least formally,
a particular solution of the differential equation, as we see by carrying 
out the calculations in the reverse direction. Three different cases can 
now occur: 


1. $r^2 - 4mk > 0$. The two roots $\lambda_1$ and $\lambda_2$ are then real, negative,
and unequal, and we have two solutions of the differential equation,


$u_1 = e^{\lambda_1 t}$ and $u_2 = e^{\lambda_2 t}$.


With the help of these two solutions we can at once construct a solution
in which two arbitrary constants are present. For after differentiation
we see that


(3) $x = c_1 u_1 + c_2 u_2$

is also a solution of the differential equation. In Section 9.2c we shall
show that this expression is in fact the most general solution of the
equation; that is, that we can obtain every solution of the equation by
substituting suitable numerical values for $c_1$ and $c_2$.

2. $r^2 - 4mk = 0$. The quadratic equation has a double root. Thus
to begin with we have, apart from a constant factor, only the one
solution $x = u_1 = e^{-rt/2m}$. But we easily verify that in this case the
function

$x = u_2 = te^{-rt/2m}$


is also a solution of the differential equation.¹ For we find that


$\dot{x} = \left(1 - \frac{r}{2m}t\right)e^{-rt/2m}$, $\ddot{x} = \left(\frac{r^2}{4m^2}t - \frac{r}{m}\right)e^{-rt/2m}$,


and by substitution we see that the differential equation


$m\ddot{x} + r\dot{x} + \frac{r^2}{4m}x = 0$


is satisfied. Then the expression

(4) $x = c_1 e^{-rt/2m} + c_2 te^{-rt/2m}$


again gives us a solution of the differential equation with two arbitrary
constants of integration $c_1$ and $c_2$.

3. $r^2 - 4mk < 0$. We put $r^2 - 4mk = -4m^2\nu^2$ and obtain two
solutions of the differential equation in complex form, given by the
expressions $x = u_1 = e^{-rt/2m + i\nu t}$ and $x = u_2 = e^{-rt/2m - i\nu t}$. Euler's
formula


$e^{i\nu t} = \cos \nu t + i \sin \nu t$


gives us for the real and imaginary parts of the complex solution $u_1$, on
the one hand the expressions


$v_1 = e^{-rt/2m} \cos \nu t$, $v_2 = e^{-rt/2m} \sin \nu t$,

and on the other hand, the representation


$v_1 = \frac{u_1 + u_2}{2}$, $v_2 = \frac{u_1 - u_2}{2i}$.

From the second form of representation we see that $v_1$ and $v_2$ are (real)
solutions of the differential equation. To verify this directly by
differentiation and substitution is a simple exercise.


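1 We are led to this solution naturally by the following limiting process: if $\lambda_1 \ne \lambda_2$,
then the expression $(e^{\lambda_1 t} - e^{\lambda_2 t})/(\lambda_1 - \lambda_2)$ also represents a solution. If we now let
$\lambda_1$ tend to $\lambda_2$ and write $\lambda$ instead of $\lambda_1$, $\lambda_2$, our expression becomes $d(e^{\lambda t})/d\lambda = te^{\lambda t}$.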




From our two particular solutions we can again form a general 
solution 
(5) $x = c_1 v_1 + c_2 v_2 = (c_1 \cos \nu t + c_2 \sin \nu t)\,e^{-rt/2m}$
with two arbitrary constants $c_1$ and $c_2$. This may also be written in
the form
(6) $x = ae^{-rt/2m} \cos \nu(t - \delta)$,
where we have put $c_1 = a \cos \nu\delta$, $c_2 = a \sin \nu\delta$, and $a$, $\delta$ are two new
constants.


We recall that we have already met this solution for the special
case $r = 0$ (Section 5.4).
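
The three cases distinguished above can be illustrated numerically. The following Python sketch is not part of the original text: the function names `characteristic_roots`, `damping_case`, and `free_solution` are our own, and the exact test `d == 0` for the double root is an idealization that is fragile for floating-point data.

```python
import cmath
import math


def characteristic_roots(m, r, k):
    """Roots of m*lam**2 + r*lam + k = 0, returned as complex numbers."""
    disc = cmath.sqrt(r * r - 4.0 * m * k)
    return (-r + disc) / (2.0 * m), (-r - disc) / (2.0 * m)


def damping_case(m, r, k):
    """Classify the free motion by the sign of r**2 - 4*m*k."""
    d = r * r - 4.0 * m * k
    if d > 0:
        return "overdamped"          # case 1: two real, unequal roots
    if d == 0:
        return "critically damped"   # case 2: double root -r/(2m)
    return "underdamped"             # case 3: complex roots, oscillation


def free_solution(m, r, k, c1, c2, t):
    """Evaluate the general solution (3), (4) or (5) at time t."""
    d = r * r - 4.0 * m * k
    decay = math.exp(-r * t / (2.0 * m))
    if d > 0:
        s = math.sqrt(d) / (2.0 * m)
        lam1 = -r / (2.0 * m) + s
        lam2 = -r / (2.0 * m) - s
        return c1 * math.exp(lam1 * t) + c2 * math.exp(lam2 * t)
    if d == 0:
        return (c1 + c2 * t) * decay
    nu = math.sqrt(-d) / (2.0 * m)
    return (c1 * math.cos(nu * t) + c2 * math.sin(nu * t)) * decay
```

For instance, `damping_case(1.0, 1.0, 1.0)` falls under case 3, and `free_solution` then evaluates expression (5) with the frequency obtained from $r^2 - 4mk = -4m^2\nu^2$.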


b. Interpretation of the Solution 


In the two cases $r > 2\sqrt{mk}$ and $r = 2\sqrt{mk}$ the solution is given by
the exponential curve or by the graph of the function $te^{-rt/2m}$, which for
large values of $t$ resembles the exponential curve, or by the superposition
of such curves. In these cases the process is aperiodic; that is, as the 
time increases the “distance” x approaches the value 0 asymptotically 
without oscillating about the value x = 0. The motion therefore is not 
oscillatory. The effect of friction or damping is so great that it prevents 
the elastic force from setting up oscillatory motions. 


It is quite different for $r < 2\sqrt{mk}$, where the damping is so small that
complex roots $\lambda_1$, $\lambda_2$ occur. The expression $x = a \cos \nu(t - \delta)\,e^{-rt/2m}$
here gives us damped harmonic oscillations. These are oscillations which
follow the sine law and have the circular frequency $\nu = \sqrt{k/m - r^2/4m^2}$,
but whose amplitude, instead of being constant, is given by the expression
$ae^{-rt/2m}$. That is, the amplitude diminishes exponentially; the greater the
expression $r/2m$ is, the faster is the rate of decrease. In physical literature this
damping factor is frequently called the attenuation constant of the damped
oscillation, the term indicating that the logarithm of the amplitude decreases
at the rate $r/2m$. A damped oscillation of this kind is illustrated in Fig. 9.2.
As before, we call the quantity $T = 2\pi/\nu$ the period of the oscillation and the
quantity $\nu\delta$ the phase displacement. For the special case $r = 0$ we again obtain simple
harmonic oscillations with the frequency $\omega_0 = \sqrt{k/m}$, the natural
frequency of the undamped oscillatory system.

Figure 9.2 Damped harmonic oscillations, $x = a \cos \nu(t - \delta)\,e^{-rt/2m}$.


c. Fulfilment of Given Initial Conditions. Uniqueness of the Solution


We have still to show that the solution with the two constants $c_1$ and
$c_2$ can be made to fit any preassigned initial state, and also that it represents
all the possible solutions of the equation. Suppose that we have
to find a solution which at time $t = 0$ satisfies the initial conditions
$x(0) = x_0$, $\dot{x}(0) = \dot{x}_0$, where the numbers $x_0$ and $\dot{x}_0$ can have any values.
Then in case 1 of Section 9.2a (p. 636) we must put


$c_1 + c_2 = x_0$,

$c_1\lambda_1 + c_2\lambda_2 = \dot{x}_0$.

For the constants $c_1$ and $c_2$ we accordingly have two linear equations,
and these have the unique solutions


$c_1 = \frac{\dot{x}_0 - \lambda_2 x_0}{\lambda_1 - \lambda_2}$, $c_2 = \frac{\dot{x}_0 - \lambda_1 x_0}{\lambda_2 - \lambda_1}$.

In case 2 the same process gives the two linear equations


$c_1 = x_0$,

$\lambda c_1 + c_2 = \dot{x}_0 \quad \left(\lambda = -\frac{r}{2m}\right)$,


from which $c_1$ and $c_2$ can again be uniquely determined. Finally, in
case 3 the equations determining the constants take the form


$a \cos \nu\delta = x_0$,

$a\left(\nu \sin \nu\delta - \frac{r}{2m} \cos \nu\delta\right) = \dot{x}_0$,

with the solutions

$\delta = \frac{1}{\nu} \arccos \frac{x_0}{a}$, $a = \frac{1}{\nu}\sqrt{\nu^2 x_0^2 + \left(\dot{x}_0 + \frac{r}{2m}x_0\right)^2}$.


Thus we have shown that the general solutions can be made to fit any
arbitrary initial conditions. We have still to show that there is no other
solution. For this we need show only that for a given initial state there
can never be two different solutions.

If two such solutions $u(t)$ and $v(t)$ existed, for which $u(0) = x_0$,
$\dot{u}(0) = \dot{x}_0$ and $v(0) = x_0$, $\dot{v}(0) = \dot{x}_0$, then their difference $w = u - v$
would also be a solution of the differential equation, and we should
have $w(0) = 0$, $\dot{w}(0) = 0$. This solution would therefore correspond to
an initial state of rest, that is, to a state in which at time $t = 0$ the
particle is in its position of rest and has zero velocity. We must show
that it can never set itself in motion. To do this we multiply both sides
of the differential equation $m\ddot{w} + r\dot{w} + kw = 0$ by $2\dot{w}$ and recall that
$2\dot{w}\ddot{w} = (d/dt)\dot{w}^2$ and $2w\dot{w} = (d/dt)w^2$. We thus obtain


$\frac{d}{dt}(m\dot{w}^2) + \frac{d}{dt}(kw^2) + 2r\dot{w}^2 = 0$.


If we integrate between the instants $t = 0$ and $t = \tau$ and use the initial
conditions $w(0) = 0$, $\dot{w}(0) = 0$, we have


$m\dot{w}^2(\tau) + kw^2(\tau) + 2r\int_0^{\tau} \left(\frac{dw}{dt}\right)^2 dt = 0$.


This equation, however, would yield a contradiction if at any time
$\tau > 0$ the function $w$ were different from 0. For then the left-hand side
of the equation would be positive, since we have taken $m$, $k$, and $r$ to be
positive, and the right-hand side is zero. Hence $w = u - v$ is always
equal to 0, which proves that the solution is unique.
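
The fitting of the constants in case 3 can be checked numerically. The following Python sketch is an illustration under our own assumptions, not part of the original text: the names `fit_underdamped` and `x_underdamped` are hypothetical, and the phase is computed with `math.atan2` rather than with the arccosine alone, because the arccosine by itself does not fix the sign of $\sin \nu\delta$ when $\dot{x}_0 + (r/2m)x_0$ is negative.

```python
import math


def fit_underdamped(m, r, k, x0, v0):
    """Return (a, delta, nu) for x(t) = a*exp(-r*t/(2m))*cos(nu*(t - delta)).

    x0 is the initial position and v0 the initial velocity.
    """
    d = 4.0 * m * k - r * r
    if d <= 0:
        raise ValueError("not underdamped: need r**2 < 4*m*k")
    nu = math.sqrt(d) / (2.0 * m)
    # From the two initial-condition equations: a*cos(nu*delta) = x0 and
    # a*sin(nu*delta) = (v0 + r*x0/(2m)) / nu.
    s = (v0 + r * x0 / (2.0 * m)) / nu
    a = math.hypot(x0, s)
    delta = math.atan2(s, x0) / nu
    return a, delta, nu


def x_underdamped(m, r, a, delta, nu, t):
    """Evaluate the damped oscillation (6) at time t."""
    return a * math.exp(-r * t / (2.0 * m)) * math.cos(nu * (t - delta))
```

With `a, delta, nu = fit_underdamped(1.0, 0.5, 4.0, 1.0, -3.0)`, the value `x_underdamped(1.0, 0.5, a, delta, nu, 0.0)` reproduces the prescribed initial position, and a difference quotient at $t = 0$ reproduces the prescribed initial velocity.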


9.3 The Nonhomogeneous Equation. Forced Oscillations 
a. General Remarks. Superposition 


Before proceeding to the solution of the problem when an external 
force f(t) is present, that is, to the solution of the nonhomogeneous 
equation, we make the following remark. 

If w and v are two solutions of the nonhomogeneous equation, the 
difference u = w — v satisfies the homogeneous equation; this we see 
at once by substitution. Conversely, if $u$ is a solution of the homogeneous
equation and $v$ a solution of the nonhomogeneous equation,
then $w = u + v$ is also a solution of the nonhomogeneous equation.
Therefore from one solution¹ of the nonhomogeneous equation we
obtain all its solutions by adding the complete integral of the homo- 
geneous equation. We therefore need find only a single solution of the 
nonhomogeneous equation. Physically this means that if we have a 
forced oscillation due to an external force, and superpose on it an 
arbitrary free oscillation, represented by a solution of the homogeneous 
equation, we obtain a phenomenon which satisfies the same nonhomogeneous
equation as the original forced oscillation. If a frictional force
is present, the free motion in the case of oscillatory motion must fade
out as time goes on because of the damping factor $e^{-rt/2m}$. Hence for a
given forced vibration with friction it is immaterial what free vibration
we superpose; the motion will always tend to the same final state as
time goes on.


1 Often called a particular integral or particular solution.

Second, we notice that the effect of a force f(t) can be split up in 
the same way as the force itself. By this we mean the following: if 
$f_1(t)$, $f_2(t)$, and $f(t)$ are three functions such that


$f_1(t) + f_2(t) = f(t)$,


and if $x_1 = x_1(t)$ is a solution of the differential equation $m\ddot{x} + r\dot{x} + kx = f_1(t)$
and $x_2 = x_2(t)$ is a solution of the equation $m\ddot{x} + r\dot{x} + kx = f_2(t)$,
then $x(t) = x_1(t) + x_2(t)$ is a solution of the differential equation


(7) $m\ddot{x} + r\dot{x} + kx = f(t)$.


A corresponding statement, of course, holds if f(t) consists of any 
number of terms. This simple but important fact is called the prin- 
ciple of superposition. The proof follows from a glance at the 
equation itself. By subdividing the function f(t) into two or more 
terms we can thus split the differential equation into several equations, 
which in certain circumstances may be easier to manipulate. 

The most important case is that of a periodic external force f(t). Such 
a periodic external force can be resolved into purely periodic components
by expansion in a Fourier series, and can therefore¹ be
approximated to as closely as we please by a sum of a finite number of
purely periodic functions. It is therefore sufficient to find the solution
of the differential equation subject to the assumption that the right-hand
side has the form

$a \cos \omega t$ or $b \sin \omega t$,


where $a$, $b$, and $\omega$ are arbitrary constants.

Instead of working with these trigonometric functions, we can 
obtain the solution more simply and neatly if we use complex notation. 
We put $f(t) = ce^{i\omega t}$, and the principle of superposition shows that we
need only consider the differential equation


(8) $m\ddot{x} + r\dot{x} + kx = ce^{i\omega t}$,


where by $c$ we mean an arbitrary real or complex constant. Such a
differential equation actually represents two real differential equations.
For if we split the right-hand side into two terms by taking, for
example, $c = 1$ and write $e^{i\omega t} = \cos \omega t + i \sin \omega t$, then $x_1$ and $x_2$, the
solutions of the two real differential equations $m\ddot{x} + r\dot{x} + kx = \cos \omega t$
and $m\ddot{x} + r\dot{x} + kx = \sin \omega t$, combine to form the solution $x = x_1 + ix_2$
of the complex differential equation. Conversely, if we first solve the
differential equations in complex form, the real part of the solution
gives us the function $x_1$ and the imaginary part the function $x_2$.


1 Provided that it is continuous and sectionally smooth (p. 604), which is the most
important case in physics.


b. Solution of the Nonhomogeneous Equation 


We solve Equation (8) by a device suggested naturally by intuition. 
We assume that $c$ is real and (for the time being) that $r \ne 0$. We now
make the guess that a motion will exist which has the same rhythm as
the periodic external force, and we accordingly attempt to find a
solution of the differential equation in the form


(9) $x = \sigma e^{i\omega t}$,


where we have only to determine the factor $\sigma$, which is independent
of the time. If we substitute this expression and its derivatives $\dot{x} = i\omega\sigma e^{i\omega t}$,
$\ddot{x} = -\omega^2\sigma e^{i\omega t}$ in the differential equation and remove the
common factor $e^{i\omega t}$, we obtain the equation


$-m\omega^2\sigma + ir\omega\sigma + k\sigma = c$

or

(10) $\sigma = \frac{c}{-m\omega^2 + ir\omega + k}$.

Conversely, we see that for this value of $\sigma$ the expression $\sigma e^{i\omega t}$ is
actually a solution of the differential equation. To express the meaning
of this result clearly, however, we must perform a few transformations.
We begin by writing the complex factor $\sigma$ in the form

(11) $\sigma = c\,\frac{k - m\omega^2 - ir\omega}{(k - m\omega^2)^2 + r^2\omega^2} = c\alpha e^{-i\omega\delta}$,

where the positive “distortion factor” $\alpha$ and the “phase displacement”
$\omega\delta$ are expressed in terms of the given quantities $m$, $r$, $k$, by the
equations


$\alpha^2 = \frac{1}{(k - m\omega^2)^2 + r^2\omega^2}$, $\sin \omega\delta = r\omega\alpha$, $\cos \omega\delta = (k - m\omega^2)\alpha$.

With this notation our solution takes the form


$x = c\alpha e^{i\omega(t - \delta)}$,

and the meaning of the result is as follows: to the force $c \cos \omega t$ there
corresponds the “effect” $c\alpha \cos \omega(t - \delta)$, and to the force $c \sin \omega t$
corresponds the effect $c\alpha \sin \omega(t - \delta)$.
Hence we see that the effect is a function of the same type as the
force, that is, an undamped oscillation. This oscillation differs from
the oscillation representing the force in that the amplitude is increased
in the ratio $\alpha : 1$ and the phase is altered by the angle $\omega\delta$. Of course,
it is easy to obtain the same result without using the complex notation, 
but at the cost of somewhat longer calculations. 

According to the remark at the beginning of this section, by finding 
this one solution we have completely solved the problem; for by 
superposing any free oscillation we can obtain the most general forced 
oscillation. 

Collecting the results, we state the following: 


The complete integral of the differential equation

$m\ddot{x} + r\dot{x} + kx = ce^{i\omega t}$


(where $r \ne 0$) is $x = c\alpha e^{i\omega(t - \delta)} + u$, where $u$ is the complete integral of
the homogeneous equation $m\ddot{x} + r\dot{x} + kx = 0$ and the quantities $\alpha$ and
$\delta$ are defined by the equations


(12) $\alpha^2 = \frac{1}{(k - m\omega^2)^2 + r^2\omega^2}$, $\sin \omega\delta = r\omega\alpha$, $\cos \omega\delta = (k - m\omega^2)\alpha$.

The constants in this general solution leave us the possibility of
making the solution suit an arbitrary initial state, that is, for arbitrarily
assigned values of $x_0$ and $\dot{x}_0$ the constants can be chosen in such a way
that $x(0) = x_0$ and $\dot{x}(0) = \dot{x}_0$.
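
The distortion factor and phase displacement of equations (12) are easy to evaluate from the complex factor of equation (10). The Python sketch below is offered only as an illustration; the function names `steady_state` and `residual` are our own, the force is taken with $c = 1$, and $\omega \ne 0$, $r \ne 0$ are assumed.

```python
import cmath
import math


def steady_state(m, r, k, omega):
    """Return (alpha, delta) with sigma = alpha*exp(-i*omega*delta), c = 1."""
    sigma = 1.0 / complex(k - m * omega**2, r * omega)  # equation (10)
    alpha = abs(sigma)                                  # distortion factor
    phase = -cmath.phase(sigma)                         # omega*delta
    return alpha, phase / omega


def residual(m, r, k, omega, t):
    """m*x'' + r*x' + k*x - cos(omega*t) for x = alpha*cos(omega*(t - delta))."""
    alpha, delta = steady_state(m, r, k, omega)
    arg = omega * (t - delta)
    x = alpha * math.cos(arg)
    xd = -alpha * omega * math.sin(arg)
    xdd = -alpha * omega**2 * math.cos(arg)
    return m * xdd + r * xd + k * x - math.cos(omega * t)
```

Calling `residual` at several times $t$ returns values that vanish up to rounding, which confirms that the force $\cos \omega t$ produces the effect $\alpha \cos \omega(t - \delta)$.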


c. The Resonance Curve 


To acquire a grasp of the solution which we have obtained and of its 
significance in applications, we shall study the distortion factor $\alpha$ as a
function of the “exciting frequency” $\omega$, that is, the function


(13) $\phi(\omega) = \frac{1}{\sqrt{(k - m\omega^2)^2 + r^2\omega^2}}$.

Such a detailed investigation is motivated by the fact that for given constants
$k$, $m$, $r$, or as we say for a given “oscillatory system,” we can think
of the system as being acted on by periodic exciting forces of very different
circular frequencies, and it is important to consider the solution
of the differential equation for these widely different exciting forces. In
order to describe the function conveniently we introduce the quantity
$\omega_0 = \sqrt{k/m}$. This number $\omega_0$ is the circular frequency which the
system would have for free oscillations if the friction $r$ were zero; or,
briefly, the natural frequency of the undamped system (cf. p. 639). The
actual frequency of the free system, owing to the friction $r$, is not equal
to $\omega_0$ but is instead

$\nu = \sqrt{\frac{k}{m} - \frac{r^2}{4m^2}}$,


where we assume that $4km - r^2 > 0$. (If this is not the case the free
system has no frequency; it is aperiodic.)

The function $\phi(\omega)$ tends asymptotically to the value zero as the
exciting frequency tends to infinity, and, in fact, it vanishes to the
order $1/\omega^2$. Furthermore, $\phi(0) = 1/k$; in other words, an exciting
force of frequency zero and magnitude one, that is, a constant force
of magnitude one, gives rise to a displacement of the oscillatory system
amounting to $1/k$. In the region of positive values of $\omega$ the derivative
$\phi'(\omega)$ cannot vanish except where the derivative of the expression
$(k - m\omega^2)^2 + r^2\omega^2$ vanishes, that is, for a value $\omega = \omega_1 > 0$ for which
the equation

$-4m\omega(k - m\omega^2) + 2r^2\omega = 0$


holds. In order that such a value may exist we must obviously have
$2km - r^2 > 0$; in this case


$\omega_1 = \sqrt{\frac{k}{m} - \frac{r^2}{2m^2}} = \sqrt{\omega_0^2 - \frac{r^2}{2m^2}}$.


Since the function $\phi(\omega)$ is positive everywhere, increases monotonically
for small values of $\omega$, and vanishes at infinity, this value $\omega_1$ must give
a maximum. We call this frequency $\omega_1$ the “resonance frequency”
of the system.

By substituting this expression for $\omega_1$ we find that the value of the
maximum is


$\phi(\omega_1) = \frac{1}{r\sqrt{k/m - r^2/4m^2}}$.

As $r \to 0$, this value increases beyond all bounds. For $r = 0$, that is,
for an undamped oscillatory system, the function $\phi(\omega)$ has an infinite
discontinuity at the value $\omega = \omega_1$. This is a limiting case to which we
shall give special consideration later.

The graph of the function $\phi(\omega)$ is called the resonance curve of the
system. The fact that for $\omega = \omega_1$ (and consequently for small values
of $r$ in the neighborhood of the natural frequency) the distortion of
amplitude $\alpha = \phi(\omega)$ is particularly large is the mathematical expression
of the “phenomenon of resonance,” which for fixed values of $m$ and $k$
is more and more evident as $r$ becomes smaller and smaller.

In Fig. 9.3 we have sketched a family of resonance curves, all corresponding
to the values $m = 1$ and $k = 1$, and consequently to $\omega_0 = 1$, but with
different values of $D = \frac{1}{2}r$. We see that for small values of $D$ well-marked
resonance occurs near $\omega = 1$; in the limiting case $D = 0$ there would be an
infinite discontinuity of $\phi(\omega)$ at $\omega = 1$, instead of a maximum. As $D$ increases
the maxima move towards the left, and for the value $D = 1/\sqrt{2}$ we have
$\omega_1 = 0$. In this last case the point where the tangent is horizontal has moved
to the origin, and the maximum has disappeared. If $D > 1/\sqrt{2}$ there is no
zero of $\phi'(\omega)$; the resonance curve no longer has a maximum, and resonance
no longer occurs.

Figure 9.3 Resonance curves (distortion $\phi(\omega)$ plotted against exciting frequency $\omega$).


In general, the resonance phenomenon ceases as soon as the condition

$2km - r^2 \le 0$


becomes true. In the case of the equality sign, the resonance curve
reaches its greatest height $\phi(0) = 1/k$ at $\omega_1 = 0$; its tangent is horizontal
there, and after an initial course which is almost horizontal it
declines towards zero.
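
The location and height of the resonance maximum can be checked against the function (13) directly. The following Python sketch is illustrative only; the names `phi` and `resonance` are our own, and a positive friction constant $r$ is assumed.

```python
import math


def phi(m, r, k, omega):
    """Distortion factor (13) as a function of the exciting frequency."""
    return 1.0 / math.sqrt((k - m * omega**2) ** 2 + (r * omega) ** 2)


def resonance(m, r, k):
    """Return (omega_1, phi(omega_1)), or None when 2*k*m - r**2 <= 0."""
    if r <= 0:
        raise ValueError("this sketch assumes r > 0")
    if 2.0 * k * m - r * r <= 0:
        return None  # no maximum at a positive frequency: no resonance
    omega1 = math.sqrt(k / m - r * r / (2.0 * m * m))
    peak = 1.0 / (r * math.sqrt(k / m - r * r / (4.0 * m * m)))
    return omega1, peak
```

For $m = k = 1$ and $r = 0.4$ the call `resonance(1.0, 0.4, 1.0)` gives a resonance frequency slightly below the natural frequency 1, and evaluating `phi` at that frequency agrees with the closed-form maximum; for $r = 1.5$ the function returns `None`, since then $2km - r^2 < 0$.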
d. Further Discussion of the Oscillation


We cannot, however, remain content with the above discussion. 
To really understand the phenomenon of forced motion an additional 
point needs to be emphasized. The particular integral $c\alpha e^{i\omega(t - \delta)}$ is to
be regarded as a limiting state which the complete integral


$x(t) = c\alpha e^{i\omega(t - \delta)} + c_1u_1 + c_2u_2$


approaches more and more closely as time goes on, since the free
oscillation $c_1u_1 + c_2u_2$ superposed on the particular integral fades away
with the passage of time. This fading away will take place slowly if $r$
is small, rapidly if $r$ is large.

Let us suppose, for example, that at the beginning of the motion,
that is, at time $t = 0$, the system is at rest, so that $x(0) = 0$ and
$\dot{x}(0) = 0$. From this we can determine the constants $c_1$ and $c_2$, and we
see at once that they are not both zero. Even when the exciting frequency
is approximately or exactly equal to $\omega_1$, so that resonance
occurs, the relatively large amplitude $\alpha = \phi(\omega_1)$ will not at first appear.
On the contrary, it will be masked by the function $c_1u_1 + c_2u_2$ and will
first make its appearance when this function fades away; that is, it will
appear more slowly as $r$ grows smaller.

For the undamped system, that is, for $r = 0$, our solution fails when
the exciting frequency is equal to the natural circular frequency
$\omega_0 = \sqrt{k/m}$, for then $\phi(\omega_0)$ is infinite. We therefore cannot obtain a
solution of the equation $m\ddot{x} + kx = e^{i\omega t}$ in the form $\sigma e^{i\omega t}$. We can,
however, at once obtain a particular solution in the form $x = \sigma te^{i\omega t}$.
If we substitute this expression in the differential equation, remembering
that

$\dot{x} = \sigma e^{i\omega t}(1 + i\omega t)$, $\ddot{x} = \sigma e^{i\omega t}(2i\omega - t\omega^2)$,

we have

$\sigma(2im\omega - m\omega^2 t + kt) = 1$,

and, since $m\omega^2 = k$,

$\sigma = \frac{1}{2im\omega} = \frac{1}{2i\sqrt{km}}$.

Thus when resonance occurs in an undamped system we have a solution

$x = \frac{t}{2im\omega}e^{i\omega t} = \frac{t}{2i\sqrt{km}}e^{i\omega t}$.

Using real notation, when $f(t) = \cos \omega t$, we have

$x = \frac{t}{2\sqrt{km}} \sin \omega t$,

and when $f(t) = \sin \omega t$ we have

$x = -\frac{t}{2\sqrt{km}} \cos \omega t$.

We thus see that we have found a function which may be referred to 
as an oscillation, but whose amplitude increases proportionally with 
the time. The superposed free oscillation does not fade away since it is 
undamped; but it retains its original amplitude and becomes un- 
important in comparison with the increasing amplitude of the special 
forced oscillation. The fact that in this case the solution oscillates 
backward and forward between positive and negative bounds which 
continually increase as time goes on represents the real meaning of the 


infinite discontinuity of the resonance function for an undamped 
system. 
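
The linear growth of the amplitude can be confirmed numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it takes the real part of $\sigma te^{i\omega t}$ with $\sigma = 1/(2im\omega)$ and $\omega = \sqrt{k/m}$, which gives $x = t \sin \omega t / (2\sqrt{km})$ for the force $\cos \omega t$, and it tests this against the equation by a second difference quotient. The function names and the step size `h` are our own choices.

```python
import math


def resonant_response(m, k, t):
    """Particular solution of m*x'' + k*x = cos(omega0*t), omega0 = sqrt(k/m)."""
    omega0 = math.sqrt(k / m)
    return t * math.sin(omega0 * t) / (2.0 * math.sqrt(k * m))


def residual(m, k, t, h=1e-4):
    """Approximate m*x'' + k*x - cos(omega0*t) using a central difference."""
    omega0 = math.sqrt(k / m)
    x_plus = resonant_response(m, k, t + h)
    x_mid = resonant_response(m, k, t)
    x_minus = resonant_response(m, k, t - h)
    xdd = (x_plus - 2.0 * x_mid + x_minus) / (h * h)
    return m * xdd + k * x_mid - math.cos(omega0 * t)
```

The residual is small at every sampled time, while the values of `resonant_response` at successive maxima of $\sin \omega t$ grow in proportion to $t$.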


e. Remarks on the Construction of Recording Instruments 


In a great variety of applications in physics and engineering the discussion 
in the previous subsection is of the utmost importance. With many in- 
struments, such as galvanometers, seismographs, oscillatory electrical circuits 
in radio receivers, and microphone diaphragms, the problem is to record 
an oscillatory displacement x due to an external periodic force. In such 
cases the quantity x satisfies our differential equation, at least to a first 
approximation. 

If $T$ is the period of oscillation of the external periodic force, we can
expand the force in a Fourier series of the form


$f(t) = \sum_{l=-\infty}^{\infty} \gamma_l e^{il(2\pi/T)t}$,

or, better still, we can think of it as represented with sufficient accuracy by
a trigonometric sum $\sum_{l=-N}^{N} \gamma_l e^{il(2\pi/T)t}$ consisting of a finite number of terms
only. By the principle of superposition (p. 641), the solution $x(t)$ of the
differential equation, apart from the superposed free oscillation, will be
represented by an infinite series¹ of the form


$x(t) = \sum_{l=-\infty}^{\infty} \sigma_l e^{il(2\pi/T)t}$


or approximately by a finite expression of the form


$x(t) = \sum_{l=-N}^{N} \sigma_l e^{il(2\pi/T)t}$.

By virtue of our previous results


$\sigma_l = \gamma_l \alpha_l e^{-i\delta_l(2\pi/T)l}$


and

$\alpha_l^2 = \frac{1}{\left(k - m\frac{4\pi^2}{T^2}l^2\right)^2 + r^2\frac{4\pi^2}{T^2}l^2}$, $\sin \frac{2\pi l}{T}\delta_l = \frac{2\pi l}{T}r\alpha_l$, $\cos \frac{2\pi l}{T}\delta_l = \left(k - m\frac{4\pi^2 l^2}{T^2}\right)\alpha_l$.


1 Questions of convergence will not be discussed here.


We can then describe the action of an arbitrary periodic external force 
in the following way: if we resolve the exciting force into purely periodic 
components, the individual terms of the Fourier series, then each com- 
ponent is subject to its own distortion of amplitude and phase displace- 
ment, and the separate effects are then superposed additively. If we are 
interested only in the distortion of amplitude (the phase displacement is only 
of secondary importance² in applications and, moreover, can be discussed in
the same way as the distortion of amplitude), a study of the resonance curve
gives us complete information about the way in which the motions of the
recording apparatus mirror the external exciting force. For very large
values of $l$ or $\omega$ $[= (2\pi/T)l]$ the effect of the exciting frequency on the displacement
$x$ will be hardly perceptible. On the other hand, all exciting frequencies
in the neighborhood of $\omega_1$, the (circular) resonance frequency, will markedly
affect the quantity x. 

In the construction of physical measuring and recording apparatus the 
constants m, r, and k are at our disposal, at least within wide limits. These 
should be chosen so that the shape of the resonance curve is as well adapted 
as possible to the special requirements of the measurement in question. 
Here two considerations predominate. First, it is desirable that the apparatus 
should be as sensitive as possible; that is, for all frequencies $\omega$ in question
the value of $\alpha$ should be as large as possible. For small values of $\omega$, as we
have seen, $\alpha$ is approximately proportional to $1/k$, so that the number $1/k$
is a measure of the sensitiveness of the instrument for small exciting frequencies.
The sensitiveness can therefore be increased by increasing $1/k$,
that is, by weakening the restoring force. 

The other important point is the necessity for relative freedom from dis- 


tortion. Let us assume that the representation f(t) = > y,e(@7/7 is an 
Loan 


adequate approximation to the exciting force. We then say that the apparatus 
records the exciting force f(t) with relative freedom from distortion if for all 
circular frequencies œ < N(27/T) the distortion factor has approximately 
the same value. This condition is indispensable if we wish to derive con- 
clusions about the exciting process directly from the behavior of the appa- 
ratus; if, for example, a recorder or radio is to reproduce both high and 
low musical notes with an approximately correct ratio of intensity. The 
requirement that the reproduction should be relatively “distortionless” can
never be satisfied exactly, since no portion of the resonance curve is exactly
horizontal. We can, however, attempt to choose the constants m, k, r of
the apparatus in such a way that no marked resonance occurs, and also in
such a way that the curve has a horizontal tangent at the beginning, so that
$\varphi(\omega) = \alpha$ remains approximately constant for small values of $\omega$. As we have
learned above, we can do this by putting

$$2km - r^2 = 0.$$

¹ Since, for example, it is imperceptible to the human ear.


Given a constant m and a constant k, we can satisfy this requirement by 
adjusting the friction r properly, for example, by inserting a properly chosen 
resistance in an electrical circuit. The resonance curve then shows us that 
from the frequency 0 to circular frequencies near the natural circular
frequency $\omega_0$ of the undamped system the instrument is nearly distortionless,
and that above this frequency the damping is considerable. We therefore
obtain relative freedom from distortion in a given interval of frequencies by
first choosing m so small and k so large that the natural circular frequency
$\omega_0$ of the undamped system is greater than any of the exciting circular
frequencies under consideration, and then choosing a damping factor r in
accordance with the equation $2km - r^2 = 0$.
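
The effect of this choice of damping can be sketched numerically. The code below assumes the distortion factor $\alpha(\omega) = 1/\sqrt{(k - m\omega^2)^2 + r^2\omega^2}$ from the earlier derivation. With $r^2 = 2km$ it reduces to $1/\sqrt{k^2 + m^2\omega^4}$, so that $k\,\alpha$ depends only on the ratio $\omega/\omega_0$, where $\omega_0 = \sqrt{k/m}$. The helper names and the sample constants are hypothetical.

```python
import math


def distortion_factor(omega, m, r, k):
    """Assumed form: alpha(omega) = 1 / sqrt((k - m*omega^2)^2 + r^2*omega^2)."""
    return 1.0 / math.sqrt((k - m * omega**2) ** 2 + (r * omega) ** 2)


def flat_damping(m, k):
    """Friction constant r that satisfies 2km - r^2 = 0."""
    return math.sqrt(2.0 * k * m)


def relative_deviation(omega, m, r, k):
    """Relative departure of alpha(omega) from its value 1/k at omega = 0."""
    return abs(distortion_factor(omega, m, r, k) * k - 1.0)


# Illustrative constants (assumed): small m and large k give omega_0 = 10.
m, k = 1.0, 100.0
omega_0 = math.sqrt(k / m)
r = flat_damping(m, k)

for fraction in (0.1, 0.25, 0.5, 1.0, 2.0):
    omega = fraction * omega_0
    dev = relative_deviation(omega, m, r, k)
    print(f"omega/omega_0 = {fraction:4.2f}   relative deviation = {dev:.4f}")
```

In this sketch the deviation stays small for exciting frequencies well below $\omega_0$ and grows quickly above it, so the usable band is set by making $\omega_0$ large enough.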
Distributive law, 2
Divergent sequence, 71 
Divergent series, 75 
Domain, 18 


Electric circuit, 228, 583 
Electrical oscillations, 635 
Electricity, quantity of, 423 
Electromotive force, 228, 583 
Ellipse, 378 
area enclosed by, 370 
evolute of, 429 
length of, 437 
rational parameter representation, 
327 
Elliptic function, 300 
Elliptic integral, 299, 321, 378, 411, 
437, 550 
Energy, conservation of, 406, 420, 421
kinetic, 375, 420 
potential, 421 
Envelope, 424 
Epicycloid, 329 
Equation, algebraic, 103 
Error, calculus of, 490 
round-off, 486
truncation, 486
Escape velocity, 417, 423 
Euler’s constant, 526, 629 
Euler's formula, 551 
Even function, 29 
Evolute, 359, 424, 427 
cusps of, 425 
of cycloid, 428 
of ellipse, 429 
Exponential function, 51, 151, 152, 
216, 249, 250, 453 
differential equation of, 223 
order of magnitude of, 249 
power series for, 546 
Extension, 24 
Extrapolation, 476 
Extremum, 238 
relative points, 238, 240 
Factorial, 56, 308
Falling bodies, 163, 169 
Fejér’s kernel, 610
Fejér’s trigonometric approximation, 610
Fermat’s principle of least time, 245 
Fixed point, 500 
Folium of Descartes, 435 
Force, 397, 398 
elastic, 404 
resultant, 397 
Fourier coefficients, 587, 594, 604 
order of magnitude of, 607
Fourier integral formula, 615
Fourier series, 571, 587
Fourier transform, 615
Fractional part, 337
Fractions, decimal, 9
Frame of reference, 360, 364 
Free fall, 402 
Frequency, circular, 575, 582 
natural, 638 
of oscillation, 405 
resonance, 644 
Fresnel integral, 311 
Function, 17 
algebraic, 49 
analytic, 545, 553 
average of, 139 
bounded, 101 
concave, 357 
convex, 357 
composite, 217 
compound, 52, 217 
continuous, 98, 100, 101, 166 
differentiability, 166, 180, 259 
elementary, 86, 261 
elliptic, 300 
even, 29 
explicit, 261 
exponential, 51, 151, 216, 453, 546
gamma, 308
“Hölder-continuous,” 44
hyperbolic, 228, 363, 552
integrable, 128
inverse, 45, 54
limit of, 82
linear, 48
“Lipschitz-continuous,” 43
monotone, 29
monotonic, 177
odd, 29
periodic, 336, 572
periodic continuations of, 338
primitive, 187, 189
quadratic, 48
rational, 47
trigonometric, 49, 165, 274, 299, 552
vector, 393
weight, 142
zeta, 559
Fundamental theorem of calculus, 185, 187, 188


Gamma function, 308
Gauss’s test, 566
Geometric mean, 16, 108
Geometric series, 67, 68
Gibbs’ phenomenon, 616
Graph, 19 
Gravitational acceleration, 413 
Gravitational constant, 413 
Gravity, 398
center of, 142
Guldin’s rule, 374


Harmonic, mean, 108
series, 629 
simple, 405 
Harmonics, 577 
Hodograph, 438 
Hölder condition, 44
Hölder-continuity, 44, 118
Hölder-exponent, 44
Homogeneous equation, 636
Hyperbola, area bounded by, 372
rational representation, 293
rectangular, 27, 231
Hyperbolic cotangent, Taylor series for, 623
Hyperbolic function, 228, 363
addition theorem for, 231
exponential expressions for, 552
inverse, 232
Hypocycloid, 331, 435 


Identity mapping, 54
Image, 19
Impedance, 584
Improper integral, 557
Inclination, angle of, 341
Incommensurable, 5
Indefinite integral, 185, 188, 189
Independent variable, 18
Indeterminate expressions, 464
Indeterminate forms, derivatives of, 466
Index, 431, 434
Inductance, 583
Induction, 57
mathematical, 57
Inequalities, 12
Cauchy-Schwarz, 15, 197
geometrical representation, 30
triangle, 14, 612
Inertia, moment of, 375
Infimum, 97
Infinite sequence, 56
Infinite series, 75
Inflection point, 237, 357, 460
Inflectional tangent, 237
Initial condition, 313, 399, 639
Initial state, 634
Instantaneous direction of motion, 395
Integrable function, 128 
Integral, 122 
additivity, 136 
analytic definition, 123 
bounds, 139 
computation of, 482 
Darboux, 199 
definite, 143 
Dirichlet, 311, 558
elementary, 363
elliptic, 299, 321, 437, 550
Fresnel, 311
improper, 301, 311, 557
indefinite, 143, 185, 188, 189
Leibnitz’s notation, 125
of differential equation, 633, 634
representation, 434
Riemann, 128, 199
sign, 125 
test for convergence, 570 
Integrand, 126 
Integration, by parts, 275 
constants of, 634
of rational functions, 282 
Intermediate value, property, 110 
theorem, 44, 100 
Interpolation, 470 
error of, 474 
linear, 182 
polynomial, 470 
Interval, 4 
closed, 4 
nested, 7 
open, 4 
Invariance, 360 
Inverse function, 45, 54 
derivative of, 206 
Involute, 426, 427, 429 
Irrational number, 6, 91, 106
Isochronous, 411 
Iteration method, 499 


Jensen's inequality, 318 
Jump discontinuity, 31, 35 


Kepler’s third law, 415 
Kinetic energy, 375, 420


Lagrange’s form for the remainder 
in Taylor's formula, 449, 452 
Lagrange’s interpolation formula, 
476, 477 

Leibnitz convergence test, 514
Leibnitz notation for integral, 125
Leibnitz rule, 203, 315
Leibnitz-Gregory series, 445, 592
Lemniscate, 102
area in, 372, 379
Length, 395
alternative definition, 350
as a parameter, 352
invariance of, 350
of curve in polar coordinates, 351
of ellipse, 437
L’Hospital’s rule, 464
Limit, definition of, 70
left-hand, 573
of a function, 82
of a sequence, 60, 70, 93
operations, 71
point, 95
right-hand, 573
Line integrals, 367
Linear function, 48 
Linear interpolation, 182 
Lipschitz-condition, 43 
Lipschitz-continuity, 43 
of differentiable functions, 178 
Logarithm, 51, 185, 250 
addition theorem, 147 
any base, 153 
calculation of, 493 
expansion of, 442 
function, 145 
natural, 145 
order of magnitude of, 249 
Lorentz-transformation, 363 


Maclaurin’s theorem, 452 
Magnitude, order of, 248 
Majorant, 521, 535
Mapping, affine, 20 
identity, 54 
into, onto, 19 
one-to-one, 29, 54, 55 
perspective, 21 
Maximum, 238, 461 
absolute, 239, 240 
existence of, 101 
norm, 612 
relative, 238 
strict, 238, 243 
value, 240 
Mean, arithmetic, 16, 139 
arithmetic-geometric, 113 
geometric, 16, 108 
harmonic, 108, 109 
value, 141 
value theorem of differential calculus, 173, 191, 222
value theorem of integral calculus, 141
value theorem of integral calculus generalized, 142
Minimum, 101, 238, 461 
absolute, 239, 240 
relative, 238 
value, 240 
Modulus, 104 
of continuity, 41, 178 
Moment, 373 
of inertia, 375 


Monotonic (monotone) function, 29, 177
Monotonic (monotone) sequence, 74, 96


Motion, circular, 415 
constrained to curve, 400 
equation of, 398 
forced, 635 
Newton’s law of, 397 
of falling bodies, 398 
on a given curve, 405 
oscillatory, 409 
uniform with velocity, 162 

Multiplication law, 153 


Natural frequency, 638 
Natural logarithm, 145 
Natural numbers, 1, 2 
Neighborhood, 12 
Nested sequence of intervals, 8 
Newton’s interpolation formula, 
471, 473 
Newton’s law of gravitation, 413 
Newton’s law of motion, 397, 400 
Newton’s method, 495, 502 
Norm, maximum, 612 
mean square, 612 
Normal, positive, 346
to curve, 345 
Null sequence, 90 
Number, algebraic, 79 
axis, 3 
complex, 103, 104 
conjugate complex, 104 
continuum, 1, 7 
irrational, 6, 91 
natural, 1
rational, 2, 106 
real, 7, 91 
transcendental, 79 


“O,” “o” notation, 253 
Odd function, 29 
Ohm’s law, 228, 584, 635 
Open interval, 4 
Operations, rational, 2
with limits, 71 
Order, 92 
of points, 339, 340 
Order of magnitude, 248 


Order of magnitude, of a function, 
252 
of vanishing function, 252 
Orientation, 339 
counterclockwise, 342 
Orthogonal directions, 390 
Oscillation, 575 
amplitude of, 405, 411 
damped harmonic, 638 
electrical, 635 
frequency of, 405 
period of, 638 
Osculating circle, 460
Osculating parabolas, 459, 476 


π, 80
Wallis’ product for, 280 
Parabola, 28, 48 
cubical, 28 
Neil’s, 168 
osculating, 459, 476 
Parallel curve, 438 
Parallel displacement, 361, 380 
Parameter, change of, 326 
time as, 328 
Parseval’s equation, 632 
Parseval’s theorem, 614
Partial fraction, 286 
Partial sum, 75 
Pendulum, cycloidal, 411, 428 
oscillation of, 410 
period of oscillation, 550 
simple, 410 
Period, 337 
of motion, 415 
of oscillation, 638 
of periodic function, 631 
of satellite, 416 
Periodic decimal fractions, 68 
Periodic functions, 572 
Perspective mapping, 21 
Phase, 575 
displacement, 575, 582, 638 
shift, 575 
Point of inflection, 357 
rational, 4 
Polar angle, 384 
Polar coordinates, 102 
area in, 371 
length of curve in, 351 
Polynomials, 47
interpolation, 470 
trigonometric, 577 
Postulates, 87 
Potential energy, 421-3 
Power series, 441, 450, 540, 554 
for exponential function, 546 
interval of convergence, 541 
Powers, sums of, 628 
with arbitrary exponents, 152 
Pressure atmosphere, 226 
Primes, 56, 111
series of reciprocal, 561 
Primitive function, 187, 189 
Product, infinite, 559 
symbolic, 217 
Projection, stereographic, 21, 292 
Properties, in the large, 348 
in the small, 348 


Quadratic function, 48 
Quadrature, 482 


Radian measure, 50 
Radius of curvature, see Curvature 
Range, 19 
Rate of change, average, 160 
instantaneous, 160, 162 
Ratio and root tests, 521 
Rational functions, 47 
Rational numbers, 2, 106
denumerable, 98
Rational operations, 2 
Rational points, 4 
Real numbers, 7, 91 
binary representation, 11 
completeness, 95 
decimal representation, 9 
not denumerable, 98 
Rectangular hyperbola, 27 
Rectifiability, 349, 436 
Reflection law, 245 
Refraction law, 246 
Relativity, special theory of, 363 
Removable singularity, 35, 40, 453
Resistance, 583 
Resonance, curve, 644 
frequency, 644 
Restriction, 24 
Riemann integral, 128, 199 
Riemann sum, 128, 301
Riemann zeta function, 621 
Rolle’s general theorem, 470 
Rolle’s theorem, 175 
Roots of unity, 105, 106 
Rotation, of axes, 392 

sense of, 341, 342 
Round-off error, 486 
Rule of false position, 497 


Scalar, 379 
Schwarz inequality, 197 
Secant of curve, 156 
Sectionally continuous, 588 
Sectionally smooth, 604 
Self-induction, 228 
Sense of rotation, 341 
on arc, 334 
positive, 334 
Sequence, bounded, 71 
convergent, 70 
divergent, 71 
infinite, 56 
limit of, 60, 93 
monotone, 74, 96 
nested, 90 
null, 90 
Series, absolutely convergent, 511, 
516, 518 
binomial, 456, 469, 547 
comparison of, 420 
conditionally convergent, 511, 
517, 518 
convergent, 75, 511 
differentiation of, 538 
Dirichlet, 568 
divergent, 51 
Fourier, 571 
geometric, 67, 68 
harmonic, 513 
hypergeometric, 567 
infinite, 75, 455 
integration of, 536 
majorants, 521 
of reciprocal prime numbers, 561 
operations with, 420 
power, 540 
rearrangement of terms, 518 
sum of infinite, 510 
trigonometric, 572 


Series, uniform convergence of, 532, 534, 535
Set, denumerable, 98
Sgn x, 31, 35
Simple pendulum, 401, 402, 410
Simpson’s rule, 485, 487
Sine, infinite product for, 602
power series for, 455
Slope of curve, 158
Snell’s law of refraction, 247
Span, 193
Speed, 353, 395, 396
Spring, 423
Stationary point, 240, 347
Stereographic projection, 21
Stirling’s formula, 504, 630
Stirling’s series, 630
Subsequence, 96
Substitution rule for integrals, 265, 267
Successive approximation, 495
Summation, by parts, 516
symbol, 75
Superposition of vibrations, 576
principle of, 641
Supremum, 97
Surface of revolution, 374
Symbolic product, 52, 385


Tangent, 460
direction cosines of, 345, 346, 354
direction of, 395
equation of, 344
formula, 483, 486
positive, 346
to a curve, 156, 161
Taylor’s formula, 446, 448, 449
Taylor’s polynomial, 447, 459
Taylor’s series, 449, 545
Taylor’s theorem, 451
Topology, 342
Tractrix, 437
Transcendental numbers, 79
Translation, 361, 380
Trapezoid, formula, 483
rule, 487
Triangle inequality, 14, 612
Trigonometric function, 49, 215, 299
differential equation of, 312
differentiation of, 205
exponential expressions for, 552
inverse, 210
orthogonality relations of, 274
representation, 104, 165
Trigonometric polynomial, 577
Trigonometric series, 571
Trochoid, 332, 435, 437
Truncation error, 486


Uniform continuity, 41, 100
Uniform convergence, 529, 532
Unity, roots of, 105, 106


Vectors, 380
angle between, 387
coordinate, 390
definition of, 380
derivative of, 393
exterior product of, 388
integral of, 393
length of, 382
opposite, 382, 384
parallel construction for sum of, 385
position, 382
resultant of, 385
scalar product of, 388
sum of, 385
unit, 390
Velocity, 395
average, 162
components, 361
of freely falling bodies, 163
Vibration, 634
amplitude of, 575
elastic, 404
harmonic, 575
period of, 575
sinusoidal, 575
superposition of, 576
Voltage, 583


Wallis’ formula, 280, 282
Weierstrass approximation theorem, 569, 608
Weierstrass principle, 95, 96
Weight, factors, 142
function, 142
of body, 399
Weighted average, 142
Work, 418, 420
diagram, 419


Zeta function of Riemann, 559, 570, 621
as infinite product, 560


